PCT 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)



(51) International Patent Classification 7:

C12N 15/31, C07K 14/285, 16/12, A61K
31/70, 39/102, 39/40, G01N 33/50

A1



(11) International Publication Number: WO 00/50599
(43) International Publication Date: 31 August 2000 (31.08.00)



(21) International Application Number: PCT/EP00/01423

(22) International Filing Date: 22 February 2000 (22.02.00) 



(30) Priority Data:
9904183.2 24 February 1999 (24.02.99) GB



(71) Applicant (for all designated States except US): SMITHKLINE
BEECHAM BIOLOGICALS S.A. [BE/BE]; Rue de
l'Institut 89, B-1330 Rixensart (BE).

(72) Inventors; and
(75) Inventors/Applicants (for US only): RUELLE, Jean-Louis
[BE/BE]; SmithKline Beecham Biologicals s.a., Rue de
l'Institut 89, B-1330 Rixensart (BE). THONNARD, Joelle
[BE/BE]; SmithKline Beecham Biologicals s.a., Rue de
l'Institut 89, B-1330 Rixensart (BE).

(74) Agent: PRIVETT, Kathryn, Louise; SmithKline Beecham, Two
[...], Middlesex TW8 9EP (GB).



(81) Designated States: AE, AL, AM, AT, AU, AZ, BA, BB, BG,
[...] DK, DM, EE,
ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP,
KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA,
MD, MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU,
SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG,
US, UZ, VN, YU, ZA, ZW, ARIPO patent (GH, GM, KE,
LS, MW, SD, SL, SZ, TZ, UG, ZW), Eurasian patent (AM,
AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT,
BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU,
MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM,
GA, GN, GW, ML, MR, NE, SN, TD, TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: HAEMOPHILUS ANTIGEN 
(57) Abstract 



The invention provides vaccine compositions comprising BASB070 polypeptides
and methods for producing such polypeptides by recombinant techniques. Also provided
[...]
to screen for antibacterial compounds.






WO 00/50599 



PCT/EP00/01423 



HAEMOPHILUS ANTIGEN 

FIELD OF THE INVENTION 

This invention relates to methods for the production of polynucleotides (herein referred to
as "BASB070 polynucleotide(s)"), polypeptides encoded by them (referred to herein as
"BASB070" or "BASB070 polypeptide(s)"), and recombinant materials. In another aspect,
the invention relates to methods for using such polypeptides and polynucleotides, including 
vaccines against bacterial infections. In a further aspect, the invention relates to diagnostic 
assays for detecting infection of certain pathogens. 


BACKGROUND OF THE INVENTION 

Haemophilus influenzae is a non-motile Gram-negative bacterium. Man is its only natural
host. H. influenzae isolates are usually classified according to their polysaccharide
capsule. Six different capsular types designated a through f have been identified. Isolates
that fail to agglutinate with antisera raised against one of these six serotypes are
classified as nontypeable, and do not express a capsule.

H. influenzae type b is clearly different from the other types in that it is a major
cause of bacterial meningitis and systemic diseases. Nontypeable H. influenzae (NTHi)
is only occasionally isolated from the blood of patients with systemic disease.

NTHi is a common cause of pneumonia, exacerbation of chronic bronchitis, sinusitis and
otitis media.

Otitis media is an important childhood disease both by the number of cases and its
potential sequelae. More than 3.5 million cases are recorded every year in the United
States, and it is estimated that 80 % of children have experienced at least one episode of
otitis before reaching the age of 3 (1). Left untreated, or becoming chronic, this disease
may lead to hearing loss that can be temporary (in the case of fluid accumulation in the
middle ear) or permanent (if the auditory nerve is damaged). In infants, such hearing
losses may be responsible for delayed speech learning.



Three bacterial species are primarily isolated from the middle ear of children with otitis
media: Streptococcus pneumoniae, NTHi and M. catarrhalis. These are present in 60 to
90 % of cases. A review of recent studies shows that S. pneumoniae and NTHi together
represent about 30 %, and M. catarrhalis about 15 % of otitis media cases (2). Other
bacteria can be isolated from the middle ear (H. influenzae type b, S. pyogenes, ...) but at
a much lower frequency (2 % of the cases or less).

Epidemiological data indicate that, for the pathogens found in the middle ear, the
colonization of the upper respiratory tract is an absolute prerequisite for the development
of an otitis; other factors are however also required to lead to the disease (3-9). These are
important to trigger the migration of the bacteria into the middle ear via the Eustachian
tubes, followed by the initiation of an inflammatory process. These other factors are
unknown to date. It has been postulated that a transient anomaly of the immune system
following a viral infection, for example, could cause an inability to control the
colonization of the respiratory tract (5). An alternative explanation is that the exposure to
environmental factors allows a more important colonization of some children, who
subsequently become susceptible to the development of otitis media because of the
sustained presence of middle ear pathogens (2).


Various proteins of H. influenzae have been shown to be involved in pathogenesis or
have been shown to confer protection upon vaccination in animal models.

Adherence of NTHi to human nasopharyngeal epithelial cells has been reported (10).
Apart from fimbriae and pili (11-15), many adhesins have been identified in NTHi.
Among them, two surface exposed high-molecular-weight proteins designated HMW1
and HMW2 have been shown to mediate adhesion of NTHi to epithelial cells (16).
Another family of high molecular weight proteins has been identified in NTHi strains
that lack proteins belonging to the HMW1/HMW2 family. The NTHi 115 kDa Hia protein
(17) is highly similar to the Hsf adhesin expressed by H. influenzae type b strains (18).
Another protein, the Hap protein, shows similarity to IgA1 serine proteases and has been
shown to be involved in both adhesion and cell entry (19).


Five major outer membrane proteins (OMPs) have been identified and numbered
sequentially.

Original studies using H. influenzae type b strains showed that antibodies specific for P1
and P2 protected infant rats from subsequent challenge (20-21). P2 was found to be able
to induce bactericidal and opsonic antibodies, which are directed against the variable
regions present within surface exposed loop structures of this integral OMP (22-23). The
lipoprotein P4 also could induce bactericidal antibodies (24).

P6 is a conserved peptidoglycan-associated lipoprotein making up 1-5 % of the outer
membrane (25). Later a lipoprotein of about the same molecular weight was recognized,
called PCP (P6 crossreactive protein) (26). A mixture of the conserved lipoproteins P4, P6
and PCP did not reveal protection as measured in a chinchilla otitis-media model (27). P6
alone appears to induce protection in the chinchilla model (28).


Another fimbrin is described with homology to P5, which in itself has sequence
homology to the integral Escherichia coli OmpA (29-30). This paradox needs further
investigation to clarify the nature and role of pilin, pilin-associated proteins,
pilin-excreting proteins and P5. It is however shown that NTHi adhere to mucus by way of
fimbriae (29). P5 appears to undergo antigenic drift during persistent infections with
NTHi (31).

In line with the observations made with gonococci and meningococci, NTHi expresses a
dual human transferrin receptor composed of TbpA and TbpB when grown under iron
limitation. Anti-TbpB protected infant rats (32). Hemoglobin / haptoglobin receptors
have also been described for NTHi (33). A receptor for Haem:Hemopexin has also been
identified (34). A lactoferrin receptor is also present in NTHi, but is not yet characterized
(35). A protein resembling neisserial FrpB-protein has not been described in NTHi.

An 80 kDa OMP, the D15 surface antigen, provides protection against NTHi in a mouse
challenge model (36). A 42 kDa outer membrane lipoprotein, LPD, is conserved amongst
Haemophilus influenzae and induces bactericidal antibodies (37). A minor 98 kDa OMP
(38) was found to be a protective antigen; this OMP may very well be one of the
Fe-limitation inducible OMPs or high molecular weight adhesins that have been
characterized thereafter. H. influenzae produces IgA1-protease activity (39). The
IgA1-proteases of NTHi reveal a high degree of antigenic variability (40).

Another OMP of NTHi, OMP26, a 26-kDa protein, has been shown to enhance
pulmonary clearance in a rat model (41). The NTHi HtrA protein has also been shown to
be a protective antigen. Indeed, this protein protected chinchillas against otitis media and
protected infant rats against H. influenzae type b bacteremia (42).

The frequency of NTHi infections has risen dramatically in the past few decades. This
phenomenon has created an unmet medical need for new anti-microbial agents, vaccines,
drug screening methods and diagnostic tests for this organism. The present invention
aims to meet that need. In particular the present invention aims to meet the need for a
vaccine effective against NTHi.


SUMMARY OF THE INVENTION 

The present invention relates to recombinant materials and methods for the production of
BASB070, in particular BASB070 polypeptides and BASB070 polynucleotides, for use
especially in therapeutic or prophylactic vaccines. In another aspect, the invention relates to
methods for using such polypeptides and polynucleotides, including prevention and
treatment of microbial diseases, amongst others. In a further aspect, the invention relates to
diagnostic assays for detecting diseases associated with microbial infections and conditions
associated with such infections, such as assays for detecting expression or activity of
BASB070 polynucleotides or polypeptides.

It has been discovered that BASB070 encodes a polypeptide that has the features of a
surface-exposed molecule recognisable by the immune system. For example, the
polypeptide encoded by BASB070 contains a signal peptide, indicating that it is exported
at least to the periplasm between the inner and outer membranes of the bacterium.
Furthermore the polypeptide has similarities to other known surface-exposed proteins and
potential similarity to other known immunogenic and immunoprotective peptides.


BASB070 is 26% identical to the HasR protein of Serratia marcescens in an 817 amino
acid overlap. S. marcescens HasR is a receptor for the HasA hemophore protein. It is a
TonB dependent protein. It has the characteristics of an integral outer membrane protein
with a β-barrel 3D structure. The β-barrels formed by the integral outer membrane
proteins are composed of anti-parallel, amphipathic β-strands. Their external loops
frequently contain immunodominant B-cell epitopes. BASB070 is sufficiently closely
related to the HasR protein of Serratia marcescens to say that BASB070 is also an
integral outer membrane protein with a β-barrel conformation. BASB070 or fragments of
it therefore provide potential vaccine antigens.


Various changes and modifications within the spirit and scope of the disclosed invention 
will become readily apparent to those skilled in the art from reading the following 
descriptions and from reading the other parts of the present disclosure. 

DESCRIPTION OF THE INVENTION

The invention relates to the use of BASB070 polypeptides and polynucleotides as described
in greater detail below. In particular, the invention relates to the use of polypeptides and
polynucleotides of a BASB070 of Haemophilus influenzae, which is related by amino acid
sequence homology to Serratia marcescens HasR hemophore receptor polypeptide. The
invention relates especially to the use of BASB070 having the nucleotide and amino acid
sequences set out in SEQ ID NO:1 or 3 and SEQ ID NO:2 or 4 respectively.

The invention further relates to uses of polynucleotides and polypeptides which have at
least 85% identity, preferably at least 90% identity, more preferably at least 95% identity,
most preferably at least 97-99% or exact identity to the sequences identified in SEQ ID
NO:1 or 3 and SEQ ID NO:2 or 4.
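The identity thresholds above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and uses one common convention, identical columns divided by aligned columns. That convention, the function names and the toy sequences are assumptions made for illustration; the definition of identity given elsewhere in this specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs come from the same alignment, so they have
    equal length, with '-' marking a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns where both sequences carry a gap hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def strongest_tier(pct: float):
    """Return the highest threshold from the text (85/90/95/97) that is met."""
    for threshold in (97, 95, 90, 85):
        if pct >= threshold:
            return threshold
    return None


# Toy alignment (invented, not taken from SEQ ID NO:2 or 4).
print(percent_identity("MKTA-L", "MKSAQL"))
print(strongest_tier(96.0))
```

A sequence scoring 96% by this measure would meet the "at least 95%" tier but not the "at least 97-99%" tier.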

The invention also relates to novel NTHi polynucleotide and polypeptide sequences 
disclosed herein. 


Polypeptides 

In one aspect of the invention there are provided uses for polypeptides of Haemophilus
influenzae referred to herein as "BASB070" and "BASB070 polypeptides" as well as
biologically, diagnostically, prophylactically, clinically or therapeutically useful variants
thereof, and compositions comprising the same.

The present invention further provides uses for: 

(a) an isolated polypeptide which comprises an amino acid sequence which has at least
85% identity, more preferably at least 90% identity, yet more preferably at least 95%
identity, most preferably at least 97-99% or exact identity, to that of SEQ ID NO:2 or 4;

(b) a polypeptide encoded by an isolated polynucleotide comprising a polynucleotide
sequence which has at least 85% identity, more preferably at least 90% identity, yet more
preferably at least 95% identity, even more preferably at least 97-99% or exact identity to
SEQ ID NO:1 or 3 over the entire length of SEQ ID NO:1 or 3 respectively; or

(c) a polypeptide encoded by an isolated polynucleotide comprising a polynucleotide
sequence encoding a polypeptide which has at least 85% identity, more preferably at least
90% identity, yet more preferably at least 95% identity, even more preferably at least
97-99% or exact identity, to the amino acid sequence of SEQ ID NO:2 or 4.

The BASB070 polypeptides provided in SEQ ID NO:2 or 4 are the BASB070 polypeptides 
from Haemophilus influenzae strains Rd KW20 and ntHi 3224. 


The invention also provides uses for immunogenic fragments of a BASB070 polypeptide,
that is, a contiguous portion of the BASB070 polypeptide which has the same or
substantially the same immunogenic activity as the polypeptide comprising the amino acid
sequence of SEQ ID NO:2 or 4. That is to say, the fragment (if necessary when coupled to a
carrier) is capable of raising an immune response which recognises the BASB070
polypeptide. Such an immunogenic fragment may include, for example, the BASB070
polypeptide lacking an N-terminal leader sequence, and/or a transmembrane domain and/or
a C-terminal anchor domain. In a preferred aspect the immunogenic fragment of BASB070
according to the invention comprises substantially all of the extracellular domain of a
polypeptide which has at least 85% identity, preferably at least 90% identity, more
preferably at least 95% identity, most preferably at least 97-99% identity, to
that of SEQ ID NO:2 or 4 over the entire length of SEQ ID NO:2 or 4.

A fragment is a polypeptide having an amino acid sequence that is entirely the same as part
but not all of any amino acid sequence of any polypeptide of the invention. As with
BASB070 polypeptides, fragments may be "free-standing", or comprised within a larger
polypeptide of which they form a part or region, most preferably as a single continuous
region in a single larger polypeptide.

Preferred fragments include, for example, truncation polypeptides having a portion of an
amino acid sequence of SEQ ID NO:2 or 4 or of a variant thereof, such as a continuous
series of residues that includes an amino- and/or carboxyl-terminal amino acid sequence.
Degradation forms of the polypeptides of the invention produced by or in a host cell are
also preferred. Further preferred are fragments characterized by structural or functional
attributes such as fragments that comprise beta-barrels, alpha-helix and alpha-helix forming
regions, beta-sheet and beta-sheet-forming regions, turn and turn-forming regions, coil and
coil-forming regions, hydrophilic regions, hydrophobic regions, alpha amphipathic regions,
beta amphipathic regions, flexible regions, surface-forming regions, substrate binding
regions, and high antigenic index regions.

Further preferred fragments include an isolated polypeptide comprising an amino acid
sequence having at least 15, 20, 30, 40, 50 or 100 contiguous amino acids from the amino
acid sequence of SEQ ID NO:2 or 4, or an isolated polypeptide comprising an amino acid
sequence having at least 15, 20, 30, 40, 50 or 100 contiguous amino acids truncated or
deleted from the amino acid sequence of SEQ ID NO:2 or 4.
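As an illustration of what "contiguous amino acids from" a reference sequence means, the following Python sketch enumerates every window of a given length and checks whether a candidate peptide is such a contiguous portion. The function names and the short toy sequence are hypothetical; the real reference would be SEQ ID NO:2 or 4, which is not reproduced here.

```python
def contiguous_fragments(sequence: str, length: int) -> list:
    """All contiguous windows of exactly `length` residues, in order."""
    if length <= 0 or length > len(sequence):
        return []
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def is_contiguous_fragment(candidate: str, reference: str, min_length: int = 15) -> bool:
    """True if `candidate` is at least `min_length` residues long and
    occurs as an uninterrupted stretch of `reference`."""
    return len(candidate) >= min_length and candidate in reference


# Toy reference (invented for illustration only).
reference = "MKKTLLAGSVALALSSAFAQ"
print(contiguous_fragments(reference, 15))
print(is_contiguous_fragment("KKTLLAGSVALALSS", reference))
```

A sequence of N residues contains N - k + 1 windows of length k, so the 20-residue toy reference yields six 15-residue fragments.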


Particularly preferred are variants in which several, 5-10, 1-5, 1-3, 1-2 or 1 amino acids 
are substituted, deleted, or added in any combination. 

The polypeptides, or immunogenic fragments, for use in the invention may be in the
form of the "mature" protein or may be a part of a larger protein such as a precursor or a
fusion protein. It is often advantageous to include an additional amino acid sequence 
which contains secretory or leader sequences, pro-sequences, sequences which aid in 
purification such as multiple histidine residues, or an additional sequence for stability
during recombinant production. Furthermore, addition of exogenous polypeptide or
lipid tail or polynucleotide sequences to increase the immunogenic potential of the final 
molecule is also considered. 

In one aspect, the invention relates to the use of genetically engineered soluble fusion
proteins comprising a polypeptide of the present invention, or a fragment thereof, and
various portions of the constant regions of heavy or light chains of immunoglobulins of
various subclasses (IgG, IgM, IgA, IgE). Preferred as an immunoglobulin is the
constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes
place at the hinge region. In a particular embodiment, the Fc part can be removed
simply by incorporation of a cleavage sequence which can be cleaved with blood
clotting factor Xa.
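The idea of an engineered factor Xa cleavage sequence can be sketched in Python as follows. The text does not state the recognition motif; the sketch assumes the commonly cited factor Xa site Ile-(Glu or Asp)-Gly-Arg with cleavage after the arginine. The function name and the toy fusion sequence are hypothetical, and real cleavage efficiency depends on the flanking residues and on folding.

```python
import re

# Assumption (not from the source): factor Xa recognises I-(E/D)-G-R
# and cuts immediately after the arginine.
FACTOR_XA_SITE = re.compile(r"I[ED]GR")


def cleave_factor_xa(fusion: str) -> list:
    """Split a one-letter-code protein sequence after each assumed site."""
    pieces = []
    start = 0
    for match in FACTOR_XA_SITE.finditer(fusion):
        pieces.append(fusion[start:match.end()])
        start = match.end()
    pieces.append(fusion[start:])
    return pieces


# Toy fusion: invented antigen part, linker with the site, invented partner part.
print(cleave_factor_xa("MKTLIEGRDKTH"))
```

In this toy layout the recognition residues stay attached to the upstream piece, which is why constructs are usually designed so that the protein of interest lies downstream of the site.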

Examples of fusion protein technology can be found in International Patent Application
Nos. WO94/29458 and WO94/22914.

The proteins may be chemically conjugated, or expressed as recombinant fusion
proteins allowing increased levels to be produced in an expression system as compared
to non-fused protein. The fusion partner may assist in providing T helper epitopes
(immunological fusion partner), preferably T helper epitopes recognised by humans, or
assist in expressing the protein (expression enhancer) at higher yields than the native
recombinant protein. Preferably the fusion partner will be both an immunological
fusion partner and expression enhancing partner.

Fusion partners include protein D from Haemophilus influenzae and the non-structural
protein from influenza virus, NS1 (hemagglutinin). Another fusion partner is the
protein known as LytA. Preferably the C terminal portion of the molecule is used. LytA
is derived from Streptococcus pneumoniae, which synthesizes an N-acetyl-L-alanine
amidase, LytA (coded by the lytA gene {Gene, 43 (1986) page 265-272}), an autolysin
that specifically degrades certain bonds in the peptidoglycan backbone. The C-terminal
domain of the LytA protein is responsible for the affinity to choline or to some
choline analogues such as DEAE. This property has been exploited for the development
of E. coli C-LytA expressing plasmids useful for expression of fusion proteins.

Purification of hybrid proteins containing the C-LytA fragment at their amino terminus
has been described {Biotechnology: 10, (1992) page 795-798}. It is possible to use the
repeat portion of the LytA molecule found in the C terminal end starting at residue 178,
for example residues 188 - 305.


The present invention also includes variants of the aforementioned polypeptides, that is
polypeptides that vary from the referents by conservative amino acid substitutions,
whereby a residue is substituted by another with like characteristics. Typical such
substitutions are among Ala, Val, Leu and Ile; among Ser and Thr; among the acidic
residues Asp and Glu; among Asn and Gln; and among the basic residues Lys and Arg; or
aromatic residues Phe and Tyr.
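The substitution groups listed above can be expressed as a small lookup, shown here as a Python sketch. The groups are taken directly from the text (in one-letter code); the function names and the toy peptides are hypothetical, and the check only covers same-length substitution variants, not insertions or deletions.

```python
# Groups from the text: Ala/Val/Leu/Ile; Ser/Thr; Asp/Glu; Asn/Gln; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("AVLI"), set("ST"), set("DE"), set("NQ"), set("KR"), set("FY"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues are identical or share a group."""
    if original == replacement:
        return True
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


def is_conservative_variant(reference: str, variant: str) -> bool:
    """True if `variant` differs from `reference` only by conservative
    substitutions at matching positions."""
    if len(reference) != len(variant):
        return False
    return all(is_conservative(r, v) for r, v in zip(reference, variant))


# Toy peptides (invented for illustration).
print(is_conservative_variant("MKLSD", "MRISE"))  # K>R, L>I, D>E
print(is_conservative_variant("MKLSD", "MKLSW"))  # D>W is not in any group
```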

Polypeptides for use in the present invention can be prepared in any suitable manner.
Such polypeptides include isolated naturally occurring polypeptides, recombinantly
produced polypeptides, synthetically produced polypeptides, or polypeptides produced by
a combination of these methods. Means for preparing such polypeptides are well
understood in the art.

It is most preferred that a polypeptide for use in the invention is derived from Haemophilus
influenzae; however, it may preferably be obtained from other organisms of the same
taxonomic genus. A polypeptide for use in the invention may also be obtained, for example,
from organisms of the same taxonomic family or order.

Polynucleotides

It is an object of the invention to provide uses for polynucleotides that encode BASB070
polypeptides, particularly polynucleotides that encode the polypeptide herein designated
BASB070, for use in or in preparation of the vaccine compositions described herein.

In a particularly preferred embodiment the polynucleotide comprises a region encoding
BASB070 polypeptides comprising a sequence set out in SEQ ID NO:1 or 3 which includes
a full length gene, or a variant thereof.


The BASB070 polynucleotides provided in SEQ ID NO: 1 or 3 are the BASB070 
polynucleotides from Haemophilus influenzae strains Rd KW20 and ntHi 3224. 

Using the information provided herein, such as a polynucleotide sequence set out in SEQ ID
NO:1 or 3, a polynucleotide of the invention encoding BASB070 polypeptide may be
obtained using standard cloning and screening methods, such as those for cloning and
sequencing chromosomal DNA fragments from bacteria, followed by obtaining a full length
clone. For example, to obtain a polynucleotide sequence for use in the invention, such as a
polynucleotide sequence given in SEQ ID NO:1 or 3, typically a library of clones of
chromosomal DNA of Haemophilus influenzae in E. coli or some other suitable host is
probed with a radiolabeled oligonucleotide, preferably a 17-mer or longer, derived from a
partial sequence. Clones carrying DNA identical to that of the probe can then be
distinguished using stringent hybridization conditions. By sequencing the individual
clones thus identified by hybridization with sequencing primers designed from the
original polypeptide or polynucleotide sequence it is then possible to extend the
polynucleotide sequence in both directions to determine a full length gene sequence.
Conveniently, such sequencing is performed, for example, using denatured double
stranded DNA prepared from a plasmid clone. Suitable techniques are described by
Maniatis, T., Fritsch, E.F. and Sambrook et al., MOLECULAR CLONING, A
LABORATORY MANUAL, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York (1989) (see in particular Screening By Hybridization 1.90 and
Sequencing Denatured Double-Stranded DNA Templates 13.70). Direct genomic DNA
sequencing may also be performed to obtain a full length gene sequence.
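The probe-based screening described above is a wet-lab procedure, but its logic can be mimicked in silico. The Python sketch below lists candidate 17-mer probes from a partial sequence and tests whether a clone insert carries a sequence identical to a probe on either strand. Exact string matching stands in for stringent hybridization as a deliberate simplification, and all names and sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def candidate_probes(partial_sequence: str, length: int = 17) -> list:
    """Every oligonucleotide of `length` nt contained in the partial sequence."""
    return [partial_sequence[i:i + length]
            for i in range(len(partial_sequence) - length + 1)]


def clone_carries_probe(clone_insert: str, probe: str) -> bool:
    """True if the insert contains the probe sequence on either strand."""
    return probe in clone_insert or reverse_complement(probe) in clone_insert


# Toy data (invented): a 20-nt partial sequence and a clone containing it.
partial = "ATGAAAGCTTTAGGCCATGC"
probes = candidate_probes(partial)
clone = "GGG" + partial + "CCC"
print(len(probes))
print(clone_carries_probe(clone, probes[0]))
```

Checking both strands reflects the fact that a cloned insert may sit in either orientation relative to the probe.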

Moreover, the DNA sequence set out in SEQ ID NO:1 or 3 contains an open reading frame
encoding a protein having about the number of amino acid residues set forth in SEQ ID 
NO:2 or 4 with a deduced molecular weight that can be calculated using amino acid residue 
molecular weight values well known to those skilled in the art. 
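A deduced molecular weight of this kind is the sum of the residue masses plus one water for the free termini. The Python sketch below uses standard average residue masses in daltons; these values, the function name and the toy peptide are not taken from the specification, and the result ignores signal-peptide removal and any post-translational modification such as lipidation.

```python
# Average residue masses in daltons (amino acid minus one water).
# Standard reference values, supplied here as an assumption.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per chain for the free N- and C-termini


def deduced_molecular_weight(protein: str) -> float:
    """Average molecular weight in daltons of an unmodified polypeptide."""
    unknown = set(protein) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not protein:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


# Toy peptide (invented); the real input would be SEQ ID NO:2 or 4.
print(round(deduced_molecular_weight("MKTLLAG"), 2))
```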

The polynucleotide of SEQ ID NO:1, between the start codon at nucleotide number 1 and
the stop codon which begins at nucleotide number 2740 of SEQ ID NO:1, encodes the
polypeptide of SEQ ID NO:2.

The polynucleotide of SEQ ID NO:3, between the start codon at nucleotide number 1 and 
the stop codon which begins at nucleotide number 2755 of SEQ ID NO:3, encodes the 
polypeptide of SEQ ID NO:4. 
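The coordinates given above fix the length of each coding region. As a quick arithmetic check, the Python sketch below counts the codons between the start position and the first nucleotide of the stop codon, assuming 1-based numbering and that the stop codon itself is not translated. The function name is hypothetical, and the resulting lengths are derived from the stated coordinates rather than read from the sequence listing.

```python
def coding_codons(start_nt: int, stop_codon_start_nt: int) -> int:
    """Number of translated codons from `start_nt` (1-based, inclusive)
    up to, but not including, the stop codon beginning at
    `stop_codon_start_nt`."""
    coding_nt = stop_codon_start_nt - start_nt
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coordinates do not describe an in-frame coding region")
    return coding_nt // 3


# SEQ ID NO:1: start at 1, stop codon begins at 2740 (coding nt 1-2739).
print(coding_codons(1, 2740))
# SEQ ID NO:3: start at 1, stop codon begins at 2755 (coding nt 1-2754).
print(coding_codons(1, 2755))
```

Under these assumptions the two open reading frames would correspond to 913 and 918 codons respectively, a difference of five codons between the two strains.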

In a further aspect, the present invention provides uses for an isolated polynucleotide 
comprising or consisting of: 

(a) a polynucleotide sequence which has at least 85% identity, more preferably at least
90% identity, yet more preferably at least 95% identity, even more preferably at least
97-99% or exact identity to SEQ ID NO:1 or 3 over the entire length of SEQ ID NO:1 or
3 respectively; or

(b) a polynucleotide sequence encoding a polypeptide which has at least 85% identity,
more preferably at least 90% identity, yet more preferably at least 95% identity, even
more preferably at least 97-99% or exact identity, to the amino acid sequence of SEQ ID
NO:2 or 4 over the entire length of SEQ ID NO:2 or 4 respectively.

A polynucleotide encoding a polypeptide for use in the present invention, including
homologues and orthologs from species other than Haemophilus influenzae, may be
obtained by a process which comprises the steps of screening an appropriate library under
stringent hybridization conditions (for example, using a temperature in the range of 45 - 65°C
and an SDS concentration from 0.1 - 1%) with a labeled or detectable probe consisting of
or comprising the sequence of SEQ ID NO:1 or 3 or a fragment thereof; and isolating a
full-length gene and/or genomic clones containing said polynucleotide sequence.

The invention provides uses for a polynucleotide sequence identical over its entire length to
a coding sequence (open reading frame) in SEQ ID NO:1 or 3. Also provided by the
invention are uses for a coding sequence for a mature polypeptide or a fragment thereof, by
itself as well as a coding sequence for a mature polypeptide or a fragment in reading frame
with another coding sequence, such as a sequence encoding a leader or secretory sequence, a
pre-, or pro- or prepro-protein sequence. The polynucleotide may also contain at least one
non-coding sequence, including for example, but not limited to at least one non-coding 5'
and 3' sequence, such as the transcribed but non-translated sequences, termination signals
(such as rho-dependent and rho-independent termination signals), ribosome binding sites,
Kozak sequences, sequences that stabilize mRNA, introns, and polyadenylation signals.

The polynucleotide sequence may also comprise additional coding sequence encoding
additional amino acids. For example, a marker sequence that facilitates purification of the
fused polypeptide can be encoded. In certain embodiments of the invention, the marker
sequence is a hexa-histidine peptide, as provided in the pQE vector (Qiagen, Inc.) and
described in Gentz et al., Proc. Natl. Acad. Sci., USA 86: 821-824 (1989), or an HA peptide
tag (Wilson et al., Cell 37: 767 (1984)), both of which may be useful in purifying
polypeptide sequence fused to them. Polynucleotides for use with the invention also
include, but are not limited to, polynucleotides comprising a structural gene and its naturally
associated sequences that control gene expression.

The nucleotide sequence encoding BASB070 polypeptide of SEQ ID NO:2 or 4 may be
identical to the polypeptide encoding sequence contained in nucleotides 1 to 2739 of SEQ
ID NO:1, or the polypeptide encoding sequence contained in nucleotides 1 to 2754 of SEQ
ID NO:3, respectively. Alternatively it may be a sequence, which as a result of the
redundancy (degeneracy) of the genetic code, also encodes the polypeptide of SEQ ID
NO:2 or 4.
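The degeneracy mentioned above means that different nucleotide sequences can encode the same polypeptide. The Python sketch below builds the standard genetic code table and shows two distinct toy coding sequences that translate to the same peptide. The sequences and function name are invented for illustration; the bacterial translation table differs from the standard one only in its permitted start codons, which does not affect this example.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate an in-frame DNA coding sequence, stopping at the first stop codon."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(coding_sequence), 3):
        aa = CODON_TABLE[coding_sequence[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two different toy sequences (invented) that encode the same tripeptide.
print(translate("ATGAAAGCTTAA"))
print(translate("ATGAAGGCCTGA"))
```

Both toy sequences differ at three nucleotide positions yet yield the same Met-Lys-Ala peptide, which is the sense in which a degenerate sequence "also encodes" the polypeptide.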

The term "polynucleotide encoding a polypeptide" as used herein encompasses
polynucleotides that include a sequence encoding a polypeptide of the invention,
particularly a bacterial polypeptide and more particularly a polypeptide of the Haemophilus
influenzae BASB070 having an amino acid sequence set out in SEQ ID NO:2 or 4. The
term also encompasses polynucleotides that include a single continuous region or
discontinuous regions encoding the polypeptide (for example, polynucleotides interrupted
by integrated phage, an integrated insertion sequence, an integrated vector sequence, an
integrated transposon sequence, or due to RNA editing or genomic DNA reorganization)
together with additional regions, that also may contain coding and/or non-coding sequences.

The invention further relates to variants of the polynucleotides described herein that encode
variants of a polypeptide having a deduced amino acid sequence of SEQ ID NO:2 or 4.

Fragments of polynucleotides of the invention may be used, for example, to synthesize full- 
length polynucleotides of the invention. 

Further particularly preferred embodiments are polynucleotides encoding BASB070
variants, that have the amino acid sequence of BASB070 polypeptide of SEQ ID NO:2 or 4
in which several, a few, 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues are substituted,
modified, deleted and/or added, in any combination. Especially preferred among these are
silent substitutions, additions and deletions, that do not alter the properties and activities of
BASB070 polypeptide.

Further preferred for use in the invention are polynucleotides that are at least 85% identical 
over their entire length to a polynucleotide encoding BASB070 polypeptide having an 
amino acid sequence set out in SEQ ID NO:2 or 4, and polynucleotides that are 
complementary to such polynucleotides. Alternatively, most highly preferred are 
polynucleotides that comprise a region that is at least 90% identical over its entire length to 
a polynucleotide encoding BASB070 polypeptide and polynucleotides complementary 
thereto. In this regard, polynucleotides at least 95% identical over their entire length to the 
same are particularly preferred. Furthermore, those with at least 97% are highly preferred 
among those with at least 95%, and among these those with at least 98% and at least 99% 
are particularly highly preferred, with at least 99% being the more preferred. 
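
The identity thresholds above can be illustrated with a simplified calculation. The sketch below assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identical columns over the full alignment; the function names and the default threshold argument are hypothetical, and this simplification is not a substitute for whatever definition of identity the specification gives elsewhere.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns carrying the same residue; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=85.0):
    """True when identity over the entire alignment is at least the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-mers that differ at one position are 90% identical, which satisfies the 85% threshold but not the 95% one.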

Preferred embodiments are polynucleotides encoding polypeptides that retain substantially 
the same biological function or activity as the mature polypeptide encoded by a DNA of 
SEQ ID NO:1 or 3. 

In accordance with certain preferred embodiments of this invention there are provided 
polynucleotides that hybridize, particularly under stringent conditions, to BASB070 
polynucleotide sequences, such as the polynucleotides in SEQ ID NO:1 or 3. 

The invention further relates to polynucleotides that hybridize to the polynucleotide 
sequences provided herein. In this regard, the invention especially relates to polynucleotides 
that hybridize under stringent conditions to the polynucleotides described herein. As herein 
used, the terms "stringent conditions" and "stringent hybridization conditions" mean 
hybridization occurring only if there is at least 95% and preferably at least 97% identity 
between the sequences. A specific example of stringent hybridization conditions is 
overnight incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (150mM 
NaCl, 15mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's 
solution, 10% dextran sulfate, and 20 micrograms/ml of denatured, sheared salmon sperm 
DNA, followed by washing the hybridization support in 0.1x SSC at about 65°C. 
Hybridization and wash conditions are well known and exemplified in Sambrook, et al., 
Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., 
(1989), particularly Chapter 11 therein. Solution hybridization may also be used with the 
polynucleotide sequences provided by the invention. 

A coding region of a BASB070 gene may be isolated by screening using a DNA sequence 
provided in SEQ ID NO:1 or 3 to synthesize an oligonucleotide probe. A labeled 
oligonucleotide having a sequence complementary to that of a gene of the invention is then 
used to screen a library of cDNA, genomic DNA or mRNA to determine which members of 
the library the probe hybridizes to. 
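
The notion of a probe "complementary to" a gene sequence can be made concrete with a small sketch. It computes the reverse complement of a probe and reports where on a target strand the probe would pair perfectly. The function names and the short example sequences are hypothetical, and real hybridization tolerates mismatches in a way this exact-match search does not model.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(target, probe):
    """0-based positions on the target strand where the probe is perfectly complementary."""
    site = reverse_complement(probe).upper()
    target = target.upper()
    last_start = len(target) - len(site)
    return [i for i in range(last_start + 1) if target[i:i + len(site)] == site]
```

A probe designed this way pairs with the stretch of the target that equals the probe's reverse complement; an empty list means no perfectly matching site was found.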

There are several methods available and well known to those skilled in the art to obtain 
full-length DNAs, or extend short DNAs, for example those based on the method of Rapid 
Amplification of cDNA ends (RACE) (see, for example, Frohman, et al., PNAS USA 85: 
8998-9002, 1988). Recent modifications of the technique, exemplified by the Marathon™ 
technology (Clontech Laboratories Inc.) for example, have significantly simplified the 
search for longer cDNAs. In the Marathon™ technology, cDNAs have been prepared 
from mRNA extracted from a chosen tissue and an 'adaptor' sequence ligated onto each 
end. Nucleic acid amplification (PCR) is then carried out to amplify the "missing" 5' end 
of the DNA using a combination of gene specific and adaptor specific oligonucleotide 
primers. The PCR reaction is then repeated using "nested" primers, that is, primers 
designed to anneal within the amplified product (typically an adaptor specific primer that 
anneals further 3' in the adaptor sequence and a gene specific primer that anneals further 5' 
in the selected gene sequence). The products of this reaction can then be analyzed by 
DNA sequencing and a full-length DNA constructed either by joining the product directly 
to the existing DNA to give a complete sequence, or carrying out a separate full-length 
PCR using the new sequence information for the design of the 5' primer. 

The invention also provides uses for polynucleotides that encode a polypeptide that is the 
mature protein plus additional amino or carboxyl-terminal amino acids, or amino acids 
interior to the mature polypeptide (when the mature form has more than one polypeptide 
chain, for instance). Such sequences may play a role in processing of a protein from 
precursor to a mature form, may allow protein transport, may lengthen or shorten protein 
half-life or may facilitate manipulation of a protein for assay or production, among other 
things. As generally is the case in vivo, the additional amino acids may be processed away 
from the mature protein by cellular enzymes. 

A precursor protein, having a mature form of the polypeptide fused to one or more 
prosequences may be an inactive form of the polypeptide. When prosequences are removed 
such inactive precursors generally are activated. Some or all of the prosequences may be 
removed before activation. Generally, such precursors are called proproteins. 

In accordance with one particular aspect of the invention, there is provided the use of a 
polynucleotide as described herein for therapeutic or prophylactic purposes, in particular 
genetic immunization. This is described in more detail later on in the section headed 
"Vaccines". 

The use of a polynucleotide of the invention in genetic immunization will preferably 
employ a suitable delivery method such as direct injection of plasmid DNA into muscles 
(Wolff et al., Hum Mol Genet (1992) 1: 363, Manthorpe et al., Hum. Gene Ther. (1983) 4: 
419), delivery of DNA complexed with specific protein carriers (Wu et al., J Biol Chem. 
(1989) 264: 16985), coprecipitation of DNA with calcium phosphate (Benvenisty & 
Reshef, PNAS USA, (1986) 83: 9551), encapsulation of DNA in various forms of 
liposomes (Kaneda et al., Science (1989) 243: 375), particle bombardment (Tang et al., 
Nature (1992) 356:152, Eisenbraun et al., DNA Cell Biol (1993) 12: 791) and in vivo 
infection using cloned retroviral vectors (Seeger et al., PNAS USA (1984) 81: 5849). 

Vectors, Host Cells, Expression Systems 

The invention relates to vectors that comprise a polynucleotide or polynucleotides of the 
invention, host cells that are genetically engineered with vectors of the invention and the 
production of polypeptides of the invention by recombinant techniques. Cell-free 
translation systems can also be employed to produce such proteins using RNAs derived 
from the DNA constructs of the invention. 

Recombinant polypeptides for use in the present invention may be prepared using 
genetically engineered host cells comprising expression systems by processes well known in 
the art. Accordingly, in a further aspect, the present invention relates to expression systems 
that comprise a polynucleotide or polynucleotides of the present invention, to host cells 
which are genetically engineered with such expression systems, and to the production of 
polypeptides of the invention by recombinant techniques. 

For recombinant production of the polypeptides of the invention, host cells can be 
genetically engineered to incorporate expression systems or portions thereof or 
polynucleotides of the invention. Introduction of a polynucleotide into the host cell can 
be effected by methods described in many standard laboratory manuals, such as Davis, et 
al., BASIC METHODS IN MOLECULAR BIOLOGY, (1986) and Sambrook, et al., 
MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor 
Laboratory Press, Cold Spring Harbor, N.Y. (1989), such as calcium phosphate 
transfection, DEAE-dextran mediated transfection, transvection, microinjection, cationic 
lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic 
introduction and infection. 

Representative examples of appropriate hosts include bacterial cells, such as cells of 
streptococci, staphylococci, enterococci, E. coli, streptomyces, cyanobacteria, Bacillus 
subtilis, Neisseria, Moraxella and Haemophilus influenzae; fungal cells, such as cells of a 
yeast, Kluyveromyces, Saccharomyces, a basidiomycete, Candida albicans and Aspergillus; 
insect cells such as cells of Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, 
COS, HeLa, C127, 3T3, BHK, 293, CV-1 and Bowes melanoma cells; and plant cells, such 
as cells of a gymnosperm or angiosperm. 

A great variety of expression systems can be used to produce the polypeptides of the 
invention. Such vectors include, among others, chromosomal-, episomal- and virus-derived 
vectors, for example, vectors derived from bacterial plasmids, from bacteriophage, from 
transposons, from yeast episomes, from insertion elements, from yeast chromosomal 
elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia 
viruses, adenoviruses, fowl pox viruses, pseudorabies viruses, picornaviruses and 
retroviruses, and vectors derived from combinations thereof, such as those derived from 
plasmid and bacteriophage genetic elements, such as cosmids and phagemids. The 
expression system constructs may contain control regions that regulate as well as engender 
expression. Generally, any system or vector suitable to maintain, propagate or express 
polynucleotides and/or to express a polypeptide in a host may be used for expression in this 
regard. The appropriate DNA sequence may be inserted into the expression system by any 
of a variety of well-known and routine techniques, such as, for example, those set forth in 
Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, (supra). 

Polypeptides of the invention can be recovered and purified from recombinant cell cultures 
by well-known methods including ammonium sulfate or ethanol precipitation, acid 
extraction, anion or cation exchange chromatography, phosphocellulose chromatography, 
hydrophobic interaction chromatography, affinity chromatography, hydroxyapatite 
chromatography, and lectin chromatography. Most preferably, high performance liquid 
chromatography is employed for purification. Well-known techniques for refolding protein 
may be employed to regenerate active conformation when the polypeptide is denatured 
during isolation and/or purification. 

The expression system may also be a recombinant live microorganism, such as a virus or 
bacterium. The gene of interest can be inserted into the genome of a live recombinant 
virus or bacterium. Inoculation and in vivo infection with this live vector will lead to in 
vivo expression of the antigen and induction of immune responses. Viruses and bacteria 
used for this purpose are for instance: poxviruses (e.g., vaccinia, fowlpox, canarypox), 
alphaviruses (Sindbis virus, Semliki Forest Virus, Venezuelan Equine Encephalitis 
Virus), adenoviruses, adeno-associated virus, picornaviruses (poliovirus, rhinovirus), 
herpesviruses (varicella zoster virus, etc.), Listeria, Salmonella, Shigella, BCG. These 
viruses and bacteria can be virulent, or attenuated in various ways in order to obtain live 
vaccines. Such live vaccines also form part of the invention. 

Diagnostic, Prognostic, Serotyping and Mutation Assays 

This invention is also related to the use of BASB070 polynucleotides and polypeptides of 
the invention for use as diagnostic reagents. Detection of BASB070 polynucleotides and/or 
polypeptides in a eukaryote, particularly a mammal, and especially a human, will provide a 
diagnostic method for diagnosis of disease, staging of disease or response of an infectious 
organism to drugs. Eukaryotes, particularly mammals, and especially humans, particularly 
those infected or suspected to be infected with an organism comprising the BASB070 gene 
or protein, may be detected at the nucleic acid or amino acid level by a variety of well 
known techniques as well as by methods provided herein. 

Polypeptides and polynucleotides for prognosis, diagnosis or other analysis may be obtained 
from a putatively infected and/or infected individual's bodily materials. Polynucleotides 
from any of these sources, particularly DNA or RNA, may be used directly for detection or 
may be amplified enzymatically by using PCR or any other amplification technique prior to 
analysis. RNA, particularly mRNA, cDNA and genomic DNA may also be used in the 
same ways. Using amplification, characterization of the species and strain of infectious or 
resident organism present in an individual, may be made by an analysis of the genotype of a 
selected polynucleotide of the organism. Deletions and insertions can be detected by a 
change in size of the amplified product in comparison to a genotype of a reference sequence 
selected from a related organism, preferably a different species of the same genus or a 
different strain of the same species. Point mutations can be identified by hybridizing 
amplified DNA to labeled BASB070 polynucleotide sequences. Perfectly or significantly 
matched sequences can be distinguished from imperfectly or more significantly mismatched 
duplexes by DNase or RNase digestion, for DNA or RNA respectively, or by detecting 
differences in melting temperatures or renaturation kinetics. Polynucleotide sequence 
differences may also be detected by alterations in the electrophoretic mobility of 
polynucleotide fragments in gels as compared to a reference sequence. This may be carried 
out with or without denaturing agents. Polynucleotide differences may also be detected by 
direct DNA or RNA sequencing. See, for example, Myers et al., Science, 230: 1242 (1985). 
Sequence changes at specific locations also may be revealed by nuclease protection assays, 
such as RNase, V1 and S1 protection assay or a chemical cleavage method. See, for 
example, Cotton et al., Proc. Natl. Acad. Sci., USA, 85: 4397-4401 (1985). 
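
The size-based detection of deletions and insertions described in this paragraph amounts to comparing an observed amplicon length with the length expected from a reference genotype. A minimal sketch follows; the function name, the tolerance parameter (standing in for gel or capillary sizing error) and the example lengths are all hypothetical.

```python
def classify_amplicon(reference_bp, observed_bp, tolerance_bp=0):
    """Compare an observed amplicon size with the size expected from the reference."""
    delta = observed_bp - reference_bp
    if abs(delta) <= tolerance_bp:
        return "no size change"
    kind = "insertion" if delta > 0 else "deletion"
    return f"{kind} of about {abs(delta)} bp"
```

A result of "no size change" does not exclude point mutations, which leave the amplicon length unchanged and need the hybridization, nuclease protection or sequencing approaches mentioned above.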

In another embodiment, an array of oligonucleotide probes comprising BASB070 
nucleotide sequence or fragments thereof can be constructed to conduct efficient screening 
of, for example, genetic mutations, serotype, taxonomic classification or identification. 
Array technology methods are well known and have general applicability and can be used to 
address a variety of questions in molecular genetics including gene expression, genetic 
linkage, and genetic variability (see, for example, Chee et al., Science, 274: 610 (1996)). 

Thus in another aspect, the present invention relates to a diagnostic kit which comprises: 
(a) a polynucleotide of the present invention, preferably the nucleotide sequence of SEQ 
ID NO:1 or 3, or a fragment thereof; (b) a nucleotide sequence complementary to that of 
(a); (c) a polypeptide of the present invention, preferably the polypeptide of SEQ ID NO:2 
or 4 or a fragment thereof; or (d) an antibody to a polypeptide of the present invention, 
preferably to the polypeptide of SEQ ID NO:2 or 4. 

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial 
component. Such a kit will be of use in diagnosing a disease or susceptibility to a 
disease, among others. 

This invention also relates to the use of polynucleotides of the present invention as 
diagnostic reagents. Detection of a mutated form of a polynucleotide of the invention, 
preferably SEQ ID NO:1 or 3, which is associated with a disease or pathogenicity will 
provide a diagnostic tool that can add to, or define, a diagnosis of a disease, a prognosis of a 
course of disease, a determination of a stage of disease, or a susceptibility to a disease, 
which results from under-expression, over-expression or altered expression of the 
polynucleotide. Organisms, particularly infectious organisms, carrying mutations in such 
polynucleotide may be detected at the polynucleotide or polypeptide level by a variety of 
techniques, such as those described elsewhere herein. 

The nucleotide sequences of the present invention are also valuable for organism 
chromosome identification. The sequence is specifically targeted to, and can hybridize with, 
a particular location on an organism's chromosome, particularly to a Haemophilus 
influenzae chromosome. The mapping of relevant sequences to chromosomes according to 
the present invention may be an important step in correlating those sequences with 
pathogenic potential and/or an ecological niche of an organism and/or drug resistance of an 
organism, as well as the essentiality of the gene to the organism. Once a sequence has been 
mapped to a precise chromosomal location, the physical position of the sequence on the 
chromosome can be correlated with genetic map data. Such data may be found on-line in a 
sequence database. The relationship between genes and diseases that have been mapped to 
the same chromosomal region is then identified through known genetic methods, for 
example, through linkage analysis (coinheritance of physically adjacent genes) or mating 
studies, such as by conjugation. 

The differences in a polynucleotide and/or polypeptide sequence between organisms 
possessing a first phenotype and organisms possessing a different, second 
phenotype can also be determined. If a mutation is observed in some or all organisms 
possessing the first phenotype but not in any organisms possessing the second phenotype, 
then the mutation is likely to be the causative agent of the first phenotype. 
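
The comparison rule in this paragraph can be sketched as a set operation: collect every mutation seen in at least one organism of the first phenotype, then discard any mutation seen in an organism of the second phenotype. The function name and the mutation labels in the usage note are hypothetical, and the result is only a list of candidates, since co-occurrence alone does not establish causation.

```python
def candidate_mutations(first_group, second_group):
    """Mutations present in some first-phenotype organism and in no second-phenotype organism.

    Each group is a list of sets, one set of mutation labels per organism.
    """
    seen_in_first = set().union(*first_group) if first_group else set()
    seen_in_second = set().union(*second_group) if second_group else set()
    return seen_in_first - seen_in_second
```

With first-phenotype organisms carrying {"A100G", "C250T"} and {"A100G"}, and a second-phenotype organism carrying {"C250T"}, only "A100G" remains as a candidate.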

Cells from an organism carrying mutations or polymorphisms (allelic variations) in a 
polynucleotide and/or polypeptide of the invention may also be detected at the 
polynucleotide or polypeptide level by a variety of techniques, to allow for serotyping, for 
example. For example, RT-PCR can be used to detect mutations in the RNA. It is 
particularly preferred to use RT-PCR in conjunction with automated detection systems, such 
as, for example, GeneScan. RNA, cDNA or genomic DNA may also be used for the same 
purpose, PCR. As an example, PCR primers complementary to a polynucleotide encoding 
BASB070 polypeptide can be used to identify and analyze mutations. 

The invention further provides primers for, among other things, amplifying BASB070 DNA 
and/or RNA isolated from a sample derived from an individual, such as a bodily material. 
The primers may be used to amplify a polynucleotide isolated from an infected individual, 
such that the polynucleotide may then be subject to various techniques for elucidation of the 
polynucleotide sequence. In this way, mutations in the polynucleotide sequence may be 
detected and used to diagnose and/or give a prognosis for the infection or its stage or course, 
or to serotype and/or classify the infectious agent. 

The invention further provides a process for diagnosing disease, preferably bacterial 
infections, more preferably infections caused by Haemophilus influenzae, comprising 
determining from a sample derived from an individual, such as a bodily material, an 
increased level of expression of polynucleotide having a sequence of Table 1 [SEQ ID 
NO: 1 or 3]. Increased or decreased expression of a BASB070 polynucleotide can be 
measured using any one of the methods well known in the art for the quantitation of 
polynucleotides, such as, for example, amplification, PCR, RT-PCR, RNase protection, 
Northern blotting, spectrometry and other hybridization methods. 

In addition, a diagnostic assay in accordance with the invention for detecting over- or under- 
expression of BASB070 polypeptide compared to normal control tissue samples may be 
used to detect the presence of an infection, for example. Assay techniques that can be used 
to determine levels of a BASB070 polypeptide, in a sample derived from a host, such as a 
bodily material, are well known to those of skill in the art. Such assay methods include 
radioimmunoassays, competitive-binding assays, Western Blot analysis, antibody sandwich 
assays, antibody detection and ELISA assays. 

The polynucleotides of the invention may be used as components of polynucleotide 
arrays, preferably high-density arrays or grids. These high-density arrays are 
particularly useful for diagnostic and prognostic purposes. For example, a set of spots 
each comprising a different gene, and further comprising a polynucleotide or 
polynucleotides of the invention, may be used for probing, such as using hybridization 
or nucleic acid amplification, using probes obtained or derived from a bodily sample, to 
determine the presence of a particular polynucleotide sequence or related sequence in an 
individual. Such a presence may indicate the presence of a pathogen, particularly 
Haemophilus influenzae, and may be useful in diagnosing and/or giving a prognosis for 
disease or a course of disease. A grid comprising a number of variants of the 
polynucleotide sequence of SEQ ID NO: 1 is preferred. Also preferred is a 
grid comprising a number of variants of a polynucleotide sequence encoding the 
polypeptide sequence of SEQ ID NO: 2. 

Antibodies 

The polypeptides and polynucleotides of the invention or variants thereof, or cells 
expressing the same can be used as immunogens to produce antibodies immunospecific for 
such polypeptides or polynucleotides respectively. 

In certain preferred embodiments of the invention there are provided antibodies against 
BASB070 polypeptides or polynucleotides. 

Antibodies generated against the polypeptides or polynucleotides of the invention can be 
obtained by administering the polypeptides and/or polynucleotides of the invention, or 
epitope-bearing fragments of either or both, analogues of either or both, or cells expressing 
either or both, to an animal, preferably a nonhuman, using routine protocols. For 
preparation of monoclonal antibodies, any technique known in the art that provides 
antibodies produced by continuous cell line cultures can be used. Examples include various 
techniques, such as those in Kohler, G. and Milstein, C., Nature 256: 495-497 (1975); 
Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pg. 77-96 in MONOCLONAL 
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc. (1985). 

Techniques for the production of single chain antibodies (U.S. Patent No. 4,946,778) can be 
adapted to produce single chain antibodies to polypeptides or polynucleotides of this 
invention. Also, transgenic mice, or other organisms such as other mammals, may be used 
to express humanized antibodies immunospecific to the polypeptides or polynucleotides of 
the invention. 

Alternatively, phage display technology may be utilized to select genes for antibodies 
with binding activities towards a polypeptide of the invention either from repertoires of 
PCR amplified v-genes of lymphocytes from humans screened for possessing anti- 
BASB070 or from naive libraries (McCafferty, et al., (1990), Nature 348, 552-554; 
Marks, et al., (1992) Biotechnology 10, 779-783). The affinity of these antibodies can 
also be improved by, for example, chain shuffling (Clackson et al., (1991) Nature 352: 
628). 

The above-described antibodies may be employed to isolate or to identify clones expressing 
the polypeptides or polynucleotides of the invention, or to purify the polypeptides or 
polynucleotides by, for example, affinity chromatography. 

Thus, among others, antibodies against BASB070-polypeptide or BASB070-polynucleotide 
may be employed to treat infections, particularly bacterial infections. 

Polypeptide variants, including antigenically, epitopically or immunologically equivalent 
variants, form a particular aspect of this invention. 

Preferably, the antibody or variant thereof is modified to make it less immunogenic in the 
individual. For example, if the individual is human the antibody may most preferably be 
"humanized," where the complementarity determining region or regions of the 
hybridoma-derived antibody have been transplanted into a human monoclonal antibody, for 
example as described in Jones et al. (1986), Nature 321, 522-525 or Tempest et al. 
(1991) Biotechnology 9, 266-273. 

In a further aspect, the present invention relates to genetically engineered soluble fusion 
proteins comprising a polypeptide of the present invention, or a fragment thereof, and 
various portions of the constant regions of heavy or light chains of immunoglobulins of 
various subclasses (IgG, IgM, IgA, IgE). Preferred as an immunoglobulin is the constant 
part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the 
hinge region. In a particular embodiment, the Fc part can be removed simply by 
incorporation of a cleavage sequence which can be cleaved with blood clotting factor Xa. 
Furthermore, this invention relates to processes for the preparation of these fusion 
proteins by genetic engineering, and to the use thereof for drug screening, diagnosis and 
therapy. A further aspect of the invention also relates to polynucleotides encoding such 
fusion proteins. Examples of fusion protein technology can be found in International 
Patent Application Nos. WO94/29458 and WO94/22914. 

Mimotopes 

In a further aspect, the present invention relates to mimotopes of the polypeptide of the 
invention. A mimotope is generally a peptide sequence, sufficiently similar to the 
native peptide (sequentially or structurally), which is capable of binding to the binding 
site of the native peptide. Thus where an antibody-binding peptide is concerned, a 
mimotope is capable of being recognised by antibodies which recognise the native 
peptide; or is capable of raising antibodies which recognise the native peptide, 
optionally when coupled to a suitable carrier. In the case of T cell recognition, a 
mimotope is capable of being recognised by the same T cells that recognise the native 
peptide; or is capable of generating a T cell response which recognises the native 
peptide. 

Peptide mimotopes may be designed for a particular purpose by addition, deletion or 
substitution of selected amino acids. Thus, the peptides may be modified for the purposes 
of ease of conjugation to a protein carrier. For example, it may be desirable for some 
chemical conjugation methods to include a terminal cysteine. In addition it may be 
desirable for peptides conjugated to a protein carrier to include a hydrophobic terminus 
distal from the conjugated terminus of the peptide, such that the free unconjugated end 
of the peptide remains associated with the surface of the carrier protein, thereby 
presenting the peptide in a conformation which most closely resembles that of the 
peptide as found in the context of the whole native molecule. For example, the peptides 
may be altered to have an N-terminal cysteine and a C-terminal hydrophobic amidated 
tail. Alternatively, the addition or substitution of a D-stereoisomer form of one or more 
of the amino acids may be performed to create a beneficial derivative, for example to 
enhance stability of the peptide and/or to increase the affinity of the peptide for a 
particular ligand. 

Mimotopes may also be retro sequences of the natural peptide sequences, in that the 
sequence orientation is reversed; or alternatively the sequences may be entirely or at 
least in part comprised of D-stereoisomer amino acids (inverso sequences). Also, the 
peptide sequences may be retro-inverso in character, in that the sequence orientation is 
reversed and the amino acids are of the D-stereoisomer form. Retro, inverso and retro- 
inverso peptides are described in WO95/24916 and WO94/05311. 
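
The three sequence transformations named above can be sketched with plain string operations. The sketch assumes a notation that is not used in the source: uppercase one-letter codes denote L-amino acids and lowercase codes denote the corresponding D-stereoisomers. The function names and the example peptide are hypothetical, and glycine, being achiral, is only nominally "inverted" by the case change.

```python
def retro(peptide):
    """Reverse the sequence orientation, keeping each residue's stereochemistry."""
    return peptide[::-1]


def inverso(peptide):
    """Keep the orientation but write every residue as its D-stereoisomer (lowercase)."""
    return peptide.lower()


def retro_inverso(peptide):
    """Reverse the orientation and use D-stereoisomers throughout."""
    return peptide[::-1].lower()
```

For the hypothetical L-peptide "ACDE", the retro form is "EDCA", the inverso form is "acde" and the retro-inverso form is "edca".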

Alternatively, peptide mimotopes may be identified using antibodies which are capable 
themselves of binding to the polypeptides of the present invention using techniques such 
as phage display technology (EP 0 552 267 B1). This technique generates a large number 
of peptide sequences which mimic the structure of the native peptides and are, therefore, 
capable of binding to anti-native peptide antibodies, but may not necessarily themselves 
share significant sequence homology to the native polypeptide. 

Vaccines 

One particularly important aspect of the invention relates to a method for inducing an 
immunological response in an individual, particularly a mammal, preferably humans, 
which comprises inoculating the individual with a BASB070 polynucleotide and/or 
polypeptide, or a fragment, or a mimotope, or variant thereof, adequate to produce 
antibody and/or T cell immune response to protect said individual from infection, 
particularly bacterial infection and most particularly Haemophilus influenzae infection. 
Also provided are methods whereby such immunological response slows bacterial 
replication. 

Yet another aspect of the invention relates to a method of inducing an immunological 
response in an individual which comprises delivering to such individual a nucleic acid 
vector, sequence or ribozyme to direct expression of BASB070 polynucleotide and/or 
polypeptide, or a fragment, or a mimotope, or a variant thereof, for expressing BASB070 
polynucleotide and/or polypeptide, or a fragment, or a mimotope, or a variant thereof in 
vivo in order to induce an immunological response, such as, to produce antibody and/or T 
cell immune response, including, for example, cytokine-producing T cells or cytotoxic T 
cells, to protect said individual, preferably a human, from disease, whether that disease is 
already established within the individual or not. One example of administering the gene is 
by accelerating it into the desired cells as a coating on particles or otherwise. Such 
nucleic acid vectors may comprise DNA, RNA, a ribozyme, a modified nucleic acid, a 
DNA/RNA hybrid, a DNA-protein complex or an RNA-protein complex. 

A further aspect of the invention relates to an immunological composition that when 
introduced into an individual, preferably a human, capable of having induced within it an 
immunological response, induces an immunological response in such individual to a 
BASB070 polynucleotide and/or polypeptide encoded therefrom, wherein the composition 
comprises a recombinant BASB070 polynucleotide and/or polypeptide encoded 
therefrom, or a fragment, or a mimotope, or a variant thereof, and/or comprises DNA 
and/or RNA which encodes and expresses an antigen of said BASB070 polynucleotide, 
polypeptide encoded therefrom, or other polypeptide of the invention, such as a fragment 
or a mimotope or a variant. The immunological response may be used therapeutically or 
prophylactically and may take the form of antibody immunity and/or cellular immunity, 
such as cellular immunity arising from CTL or CD4+ T cells. 

A BASB070 polypeptide or a fragment thereof may be fused with a co-protein or 
chemical moiety which may or may not by itself produce antibodies or induce a T cell 
response, but which is capable of stabilizing the first protein and producing a fused or 
modified protein which will have antigenic and/or immunogenic properties, and 
preferably protective properties. Thus the fused recombinant protein preferably further
comprises an antigenic co-protein, such as lipoprotein D from Haemophilus influenzae, 
Glutathione-S-transferase (GST) or beta-galactosidase, or any other relatively large co- 
protein which solubilizes the protein and facilitates production and purification thereof. 
Moreover, the co-protein may act as an adjuvant in the sense of providing a generalized 
stimulation of the immune system of the organism receiving the protein. The co-protein 
may be attached to either the amino- or carboxy-terminus of the first protein. 

In a vaccine composition according to the invention, a BASB070 polynucleotide and/or 
polypeptide, or a fragment, or a mimotope, or a variant thereof may be present in or 
encoded by a vector, such as the live recombinant vectors described above, for example
live bacterial vectors.

Also suitable are non-live vectors for the BASB070 polypeptide, for example bacterial
outer-membrane vesicles or "blebs". OM blebs are derived from the outer membrane of
the two-layer membrane of Gram-negative bacteria and have been documented in many
Gram-negative bacteria (Zhou, L. et al. 1998. FEMS Microbiol. Lett. 163:223-228)
including C. trachomatis and C. psittaci. A non-exhaustive list of bacterial pathogens
reported to produce blebs also includes: Bordetella pertussis, Borrelia burgdorferi,
Brucella melitensis, Brucella ovis, Escherichia coli, Haemophilus influenzae, Legionella
pneumophila, Neisseria gonorrhoeae, Neisseria meningitidis, Pseudomonas aeruginosa 
and Yersinia enterocolitica. 

Blebs have the advantage of providing outer-membrane proteins in their native 
conformation and are thus particularly useful for vaccines. Blebs can also be improved 
for vaccine use by engineering the bacterium so as to modify the expression of one or 
more molecules at the outer membrane. Thus for example the expression of a desired 
immunogenic protein at the outer membrane, such as the BASB070 polypeptide, can be 
introduced or upregulated (e.g. by altering the promoter). Instead or in addition, the 
expression of outer-membrane molecules which are either not relevant (e.g. unprotective 
antigens or immunodominant but variable proteins) or detrimental (e.g. toxic molecules 
such as LPS, or potential inducers of an autoimmune response) can be downregulated. 
These approaches are discussed in more detail below. 

The non-coding flanking regions of the BASB070 gene contain regulatory elements 
important in the expression of the gene. This regulation takes place both at the 
transcriptional and translational level. The sequence of these regions, either upstream or 
downstream of the open reading frame of the gene, can be obtained by DNA sequencing. 
This sequence information allows the determination of potential regulatory motifs such as 
the different promoter elements, terminator sequences, inducible sequence elements,
repressors, elements responsible for phase variation, the Shine-Dalgarno sequence, regions
with potential secondary structure involved in regulation, as well as other types of 
regulatory motifs or sequences. 

This sequence information allows the modulation of the natural expression of the
BASB070 gene. The upregulation of the gene expression may be accomplished by
altering the promoter, the Shine-Dalgarno sequence, potential repressor or operator
elements, or any other elements involved. Likewise, downregulation of expression can be
achieved by similar types of modification. Alternatively, by changing phase variation
sequences, the expression of the gene can be put under phase variation control, or it may
be uncoupled from this regulation. In another approach, the expression of the gene can be
put under the control of one or more inducible elements allowing regulated expression.
Examples of such regulation include, but are not limited to, induction by temperature
shift, addition of inducer substrates like selected carbohydrates or their derivatives, trace
elements, vitamins, co-factors, metal ions, etc.

Such modifications as described above can be introduced by several different means. The 
modification of sequences involved in gene expression can be carried out in vivo by 
random mutagenesis followed by selection for the desired phenotype. Another approach
consists in isolating the region of interest and modifying it by random mutagenesis, or
site-directed replacement, insertion or deletion mutagenesis. The modified region can then
be reintroduced into the bacterial genome by homologous recombination, and the effect
on gene expression can be assessed. In another approach, the sequence knowledge of the
region of interest can be used to replace or delete all or part of the natural regulatory
sequences. In this case, the regulatory region targeted is isolated and modified so as to
contain the regulatory elements from another gene, a combination of regulatory elements
from different genes, a synthetic regulatory region, or any other regulatory region, or to
delete selected parts of the wild-type regulatory sequences. These modified sequences can
then be reintroduced into the bacterium via homologous recombination into the genome.
A non-exhaustive list of preferred promoters that could be used for up-regulation of gene
expression includes the promoters porA, porB, lbpB, tbpB, p110, lst, hpuAB from N.
meningitidis or N. gonorrhoeae; ompCD, copB, lbpB, ompE, UspA1, UspA2, TbpB from
M. catarrhalis; p1, p2, p4, p5, p6, lpD, tbpB, D15, Hia, Hmw1, Hmw2 from H.
influenzae.

In one example, the expression of the gene can be modulated by exchanging its promoter 
with a stronger promoter (through isolating the upstream sequence of the gene, in vitro 
modification of this sequence, and reintroduction into the genome by homologous 
recombination). Upregulated expression can be obtained both in the bacterium and
in the outer membrane vesicles shed (or made) from the bacterium. 

In other examples, the described approaches can be used to generate recombinant bacterial 
strains with improved characteristics for vaccine applications. These can be, but are not 
limited to, attenuated strains, strains with increased expression of selected antigens, 
strains with knock-outs (or decreased expression) of genes interfering with the immune 
response, strains with modulated expression of immunodominant proteins, strains with 
modulated shedding of outer-membrane vesicles. 

Thus, also provided by the invention is a modified upstream region of the BASB070 gene, 
which modified upstream region contains a heterologous regulatory element which alters 
the expression level of the BASB070 protein located at the outer membrane. The 
upstream region according to this aspect of the invention includes the sequence upstream 
of the BASB070 gene. The upstream region starts immediately upstream of the BASB070 
gene and continues usually to a position no more than about 1000 bp upstream of the gene 
from the ATG start codon. In the case of a gene located in a polycistronic sequence 
(operon) the upstream region can start immediately preceding the gene of interest, or
preceding the first gene in the operon. Preferably, a modified upstream region according to
this aspect of the invention contains a heterologous promoter at a position between 500 and
700 bp upstream of the ATG. 
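
As an illustration only, the upstream region described above could be located computationally once a genome sequence and the position of the ATG start codon are known. The following is a minimal sketch under stated assumptions: the function name `upstream_region`, the 0-based coordinate convention and the toy sequence are hypothetical and are not part of the specification, and only forward-strand genes are handled. The 1000 bp limit follows the text.

```python
def upstream_region(genome: str, atg_index: int, max_len: int = 1000) -> str:
    """Return up to max_len bases immediately upstream of the start codon.

    genome    -- forward-strand DNA sequence (hypothetical input)
    atg_index -- 0-based index of the 'A' of the ATG start codon (assumed convention)
    max_len   -- upper bound on the region length; the text suggests about 1000 bp
    """
    if genome[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("atg_index does not point at an ATG start codon")
    # Clamp at the start of the sequence if fewer than max_len bases are available.
    start = max(0, atg_index - max_len)
    return genome[start:atg_index]


# Toy example: 12 bases of flanking sequence followed by a short open reading frame.
toy_genome = "TTGACATATAAT" + "ATGAAATAA"
print(upstream_region(toy_genome, atg_index=12))
```

A slice of this kind would only be a starting point; which regulatory elements it actually contains would still have to be established experimentally.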

Thus, the invention provides the BASB070 polypeptide in a modified bacterial bleb. The
invention further provides modified host cells capable of producing the non-live
membrane-based bleb vectors. The invention further provides vectors comprising the BASB070 gene
having a modified upstream region containing a heterologous regulatory element.

Further provided by the invention are processes to prepare the host cells and bacterial blebs
according to the invention. 

Vaccine antigens may be provided in a variety of other forms known in the art, depending 
on the properties of the protein. Lipoproteins for example, because of the hydrophobicity 
of the lipids added to their N-terminus, are able to aggregate and to form micelles. The
particulate nature of these structures can enhance the immunogenicity of the lipoprotein,
as compared to the unlipidated version of the protein. The size of the micelles may also
have an impact on the immunogenicity of the lipoprotein and this can be modified for
example by adjusting the extraction procedure.

Also provided by this invention are compositions, particularly vaccine compositions, and 
methods comprising the polypeptides and/or polynucleotides of the invention and 
immunostimulatory DNA sequences, such as those described in Sato, Y. et al. Science 
273: 352 (1996). 

Also provided by this invention are methods using the described polynucleotide or
particular fragments thereof, which have been shown to encode non-variable regions of 
bacterial cell surface proteins, in polynucleotide constructs used in such genetic
immunization experiments in animal models of infection with Haemophilus influenzae.
Such experiments will be particularly useful for identifying protein epitopes able to 
provoke a prophylactic or therapeutic immune response. It is believed that this approach 
will allow for the subsequent preparation of monoclonal antibodies of particular value, 
derived from the requisite organ of the animal successfully resisting or clearing infection,
for the development of prophylactic agents or therapeutic treatments of bacterial infection, 
particularly Haemophilus influenzae infection, in mammals, particularly humans. 

The invention also includes a vaccine formulation which comprises an immunogenic 
recombinant polypeptide and/or polynucleotide of the invention together with a suitable
carrier, such as any pharmaceutically acceptable carrier. Since the polypeptides and 
polynucleotides may be broken down in the stomach, each could be administered via a 
mucosal surface such as intranasally, or administered parenterally, including, for example, 
administration that is subcutaneous, intramuscular, intravenous, or intradermal. 
Formulations suitable for parenteral administration include aqueous and non-aqueous
sterile injection solutions which may contain anti-oxidants, buffers, bacteriostatic 
compounds and solutes which render the formulation isotonic with the bodily fluid, 
preferably the blood, of the individual; and aqueous and non-aqueous sterile suspensions 
which may include suspending agents or thickening agents. The formulations may be 
presented in unit-dose or multi-dose containers, for example, sealed ampoules and vials
and may be stored in a freeze-dried condition requiring only the addition of the sterile 
liquid carrier immediately prior to use. 

The vaccine formulation of the invention may also include adjuvant systems for 
enhancing the immunogenicity of the formulation.

An immune response may be broadly distinguished into two extreme categories, being
humoral or cell-mediated immune responses (traditionally characterised by antibody and
cellular effector mechanisms of protection respectively). These categories of response
have been termed TH1-type responses (cell-mediated response), and TH2-type immune
responses (humoral response).

Extreme TH1-type immune responses may be characterised by the generation of antigen
specific, haplotype restricted cytotoxic T lymphocytes, and natural killer cell responses.
In mice TH1-type responses are often characterised by the generation of antibodies of
the IgG2a subtype, whilst in the human these correspond to IgG1 type antibodies.
TH2-type immune responses are characterised by the generation of a broad range of
immunoglobulin isotypes including in mice IgG1, IgA, and IgM.

It can be considered that the driving force behind the development of these two types of 
immune responses are cytokines. High levels of TH1-type cytokines tend to favour the
induction of cell mediated immune responses to the given antigen, whilst high levels of
TH2-type cytokines tend to favour the induction of humoral immune responses to the
antigen. 

The distinction of TH1 and TH2-type immune responses is not absolute. In reality an 
individual will support an immune response which is described as being predominantly
TH1 or predominantly TH2. However, it is often convenient to consider the families of
cytokines in terms of that described in murine CD4 T cell clones by Mosmann and
Coffman (Mosmann, T.R. and Coffman, R.L. (1989) TH1 and TH2 cells: different
patterns of lymphokine secretion lead to different functional properties. Annual Review
of Immunology, 7, p145-173). Traditionally, TH1-type responses are associated with
the production of the IFN-γ and IL-2 cytokines by T-lymphocytes. Other cytokines
often directly associated with the induction of TH1-type immune responses are not
produced by T-cells, such as IL-12. In contrast, TH2-type responses are associated with
the secretion of IL-4, IL-5, IL-6 and IL-13.

It is known that certain vaccine adjuvants are particularly suited to the stimulation of
either TH1 or TH2-type cytokine responses. Traditionally the best indicators of the
TH1:TH2 balance of the immune response after a vaccination or infection include
direct measurement of the production of TH1 or TH2 cytokines by T lymphocytes in
vitro after restimulation with antigen, and/or (in the murine system) the measurement of
the IgG1:IgG2a ratio of antigen specific antibody responses.

Thus, a TH1-type adjuvant is one which preferentially stimulates isolated T-cell
populations to produce a high ratio of TH1-type cytokines when re-stimulated with
antigen in vitro, and promotes development of both CD8+ cytotoxic T lymphocytes and
antigen specific immunoglobulin responses associated with TH1-type isotype.
Adjuvants which are capable of preferential stimulation of the TH1 cell response are 
described in International Patent Application No. WO 94/00153 and WO 95/17209. 

3 De-O-acylated monophosphoryl lipid A (3D-MPL) is one such adjuvant. This is
known from GB 2220211 (Ribi). Chemically it is a mixture of 3 De-O-acylated
monophosphoryl lipid A with 4, 5 or 6 acylated chains and is manufactured by Ribi
Immunochem, Montana. A preferred form of 3 De-O-acylated monophosphoryl lipid
A is disclosed in European Patent 0 689 454 B1 (SmithKline Beecham Biologicals SA).

Preferably, the particles of 3D-MPL are small enough to be sterile filtered through a
0.22 micron membrane (European Patent number 0 689 454).
3D-MPL will be present in the range of 10 µg - 100 µg, preferably 25-50 µg per dose,
wherein the antigen will typically be present in a range of 2-50 µg per dose.

Another preferred adjuvant comprises QS21, an HPLC purified non-toxic fraction 
derived from the bark of Quillaja Saponaria Molina. Optionally this may be admixed
with 3 De-O-acylated monophosphoryl lipid A (3D-MPL), optionally together with a
carrier. 

The method of production of QS21 is disclosed in US patent No. 5,057,540. 

Non-reactogenic adjuvant formulations containing QS21 have been described 
previously (WO 96/33739). Such formulations comprising QS21 and cholesterol have 
been shown to be successful TH1 stimulating adjuvants when formulated together with 
an antigen. 

Further adjuvants which are preferential stimulators of TH1 cell responses include 
immunomodulatory oligonucleotides, for example unmethylated CpG sequences as 
disclosed in WO 96/02555. 

Combinations of different TH1 stimulating adjuvants, such as those mentioned
hereinabove, are also contemplated as providing an adjuvant which is a preferential
stimulator of TH1 cell response. For example, QS21 can be formulated together with
3D-MPL. The ratio of QS21 : 3D-MPL will typically be in the order of 1 : 10 to 10 : 1;
preferably 1 : 5 to 5 : 1 and often substantially 1 : 1. The preferred range for optimal
synergy is 2.5 : 1 to 1 : 1 3D-MPL : QS21.

Preferably a carrier which enhances immunogenicity is also present in the vaccine 
composition according to the invention. Such a carrier may be an oil in water emulsion, 
or an aluminium salt, such as aluminium phosphate or aluminium hydroxide. 

A preferred oil-in-water emulsion comprises a metabolisible oil, such as squalene, alpha 
tocopherol and Tween 80. In a particularly preferred aspect the antigens in the vaccine 
composition according to the invention are combined with QS21 and 3D-MPL in such
an emulsion. Additionally the oil in water emulsion may contain Span 85 and/or lecithin
and/or tricaprylin.

Typically for human administration QS21 and 3D-MPL will be present in a vaccine in
the range of 1 µg - 200 µg, such as 10-100 µg, preferably 10 µg - 50 µg per dose.
Typically the oil in water will comprise from 2 to 10% squalene, from 2 to 10% alpha
tocopherol and from 0.3 to 3% Tween 80. Preferably the ratio of squalene: alpha
tocopherol is equal to or less than 1 as this provides a more stable emulsion. Span 85 
may also be present at a level of 1%. In some cases it may be advantageous that the 
vaccines of the present invention will further contain a stabiliser. 
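
The typical emulsion ranges quoted above lend themselves to a simple consistency check. The sketch below is illustrative only: the function name `check_emulsion` and the message strings are hypothetical, and the percentages are taken at face value from the text without assuming whether they are by volume or by weight.

```python
def check_emulsion(squalene_pct: float, tocopherol_pct: float, tween80_pct: float) -> list:
    """Return a list of departures from the typical ranges quoted in the text."""
    issues = []
    if not 2 <= squalene_pct <= 10:
        issues.append("squalene outside 2-10%")
    if not 2 <= tocopherol_pct <= 10:
        issues.append("alpha tocopherol outside 2-10%")
    if not 0.3 <= tween80_pct <= 3:
        issues.append("Tween 80 outside 0.3-3%")
    # Preferred (not mandatory) condition: squalene : alpha tocopherol ratio <= 1.
    if tocopherol_pct > 0 and squalene_pct / tocopherol_pct > 1:
        issues.append("squalene:alpha tocopherol ratio above 1")
    return issues


# Hypothetical composition chosen only to exercise the check.
print(check_emulsion(squalene_pct=5, tocopherol_pct=5, tween80_pct=1.8))
```

An empty list means the composition sits inside every quoted range and meets the preferred ratio; it says nothing about emulsion stability in practice.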

Non-toxic oil in water emulsions preferably contain a non-toxic oil, e.g. squalane or 
squalene, an emulsifier, e.g. Tween 80, in an aqueous carrier. The aqueous carrier may 
be, for example, phosphate buffered saline. 

A particularly potent adjuvant formulation involving QS21, 3D-MPL and tocopherol in 
an oil in water emulsion is described in WO 95/17210. 

The present invention also provides a polyvalent vaccine composition comprising a 
vaccine formulation of the invention in combination with other antigens, in particular
antigens useful for treating other bacterial or viral diseases, cancers, autoimmune diseases 
and related conditions. Such a polyvalent vaccine composition may include a TH-1 
inducing adjuvant as hereinbefore described. 

While the invention has been described with reference to certain BASB070 polypeptides
and polynucleotides, it is to be understood that this covers fragments of the naturally 
occurring polypeptides and polynucleotides, and similar polypeptides and polynucleotides
with additions, deletions or substitutions which do not substantially affect the
immunogenic properties of the recombinant polypeptides or polynucleotides. 

Compositions, kits and administration 

In a further aspect of the invention there are provided compositions comprising a BASB070 
polynucleotide and/or BASB070 polypeptide for administration to a cell or to a multicellular 
organism. 

The invention also relates to compositions comprising a polynucleotide and/or a 
polypeptide discussed herein or their agonists or antagonists. The polypeptides and
polynucleotides of the invention may be employed in combination with a non-sterile or 
sterile carrier or carriers for use with cells, tissues or organisms, such as a pharmaceutical 
carrier suitable for administration to an individual. Such compositions comprise, for 
instance, a media additive or a therapeutically effective amount of a polypeptide and/or 
polynucleotide of the invention and a pharmaceutically acceptable carrier or excipient. Such 
carriers may include, but are not limited to, saline, buffered saline, dextrose, water, glycerol, 
ethanol and combinations thereof. The formulation should suit the mode of administration. 
The invention further relates to diagnostic and pharmaceutical packs and kits comprising 
one or more containers filled with one or more of the ingredients of the aforementioned 
compositions of the invention. 

Polypeptides, polynucleotides and other compounds of the invention may be employed 
alone or in conjunction with other compounds, such as therapeutic compounds. 

The pharmaceutical compositions may be administered in any effective, convenient manner 
including, for instance, administration by topical, oral, anal, vaginal, intravenous,
intraperitoneal, intramuscular, subcutaneous, intranasal, intradermal or transdermal routes
among others. 

In therapy or as a prophylactic, the active agent may be administered to an individual as 
an injectable composition, for example as a sterile aqueous dispersion, preferably
isotonic.

In a further aspect, the present invention provides for pharmaceutical compositions 
comprising a therapeutically effective amount of a polypeptide and/or polynucleotide, such 
as the soluble form of a polypeptide and/or polynucleotide of the present invention, agonist
or antagonist peptide or small molecule compound, in combination with a pharmaceutically 
acceptable carrier or excipient. Such carriers include, but are not limited to, saline, buffered 
saline, dextrose, water, glycerol, ethanol, and combinations thereof. The invention further 
relates to pharmaceutical packs and kits comprising one or more containers filled with one 
or more of the ingredients of the aforementioned compositions of the invention.

Polypeptides, polynucleotides and other compounds of the present invention may be 
employed alone or in conjunction with other compounds, such as therapeutic compounds. 

The composition will be adapted to the route of administration, for instance by a systemic or 
an oral route. Preferred forms of systemic administration include injection, typically by
intramuscular or subcutaneous injection. Other injection routes, such as intradermal, 
intraperitoneal, or intravenous can be used. Alternative means for systemic administration 
include transmucosal and transdermal administration using penetrants such as bile salts or
fusidic acids or other detergents. In addition, if a polypeptide or other compounds of the 
present invention can be formulated in an enteric or an encapsulated formulation, oral
administration may also be possible. Administration of these compounds may also be 
topical and/or localized, in the form of salves, pastes, gels, solutions, powders and the like.

For administration to mammals, and particularly humans, it is expected that the dosage
level of the active agent will be from 0.01 µg/kg to 10 µg/kg, typically around 1 µg/kg.
The physician in any event will determine the actual dosage which will be most suitable
for an individual and will vary with the age, weight and response of the particular
individual. The above dosages are exemplary of the average case. There can, of course,
be individual instances where higher or lower dosage ranges are merited, and such are 
within the scope of this invention. 

The dosage range required depends on the choice of peptide, the route of administration, the 
nature of the formulation, the nature of the subject's condition, and the judgment of the
attending practitioner. Suitable dosages, however, are in the range of 0.1-100 µg/kg of
subject. 

A vaccine composition is conveniently in injectable form. Conventional adjuvants may be 
employed to enhance the immune response. A suitable unit dose for vaccination is 0.5-5
microgram/kg of antigen, and such dose is preferably administered 1-3 times and with an 
interval of 1-3 weeks. With the indicated dose range, no adverse toxicological effects will 
be observed with the compounds of the invention which would preclude their 
administration to suitable individuals. 

Wide variations in the required dosage, however, are to be expected in view of the variety of 
compounds available and the differing efficiencies of various routes of administration. For 
example, oral administration would be expected to require higher dosages than 
administration by injection. Variations in these dosage levels can be adjusted using standard 
empirical routines for optimization, as is well understood in the art.

Sequence Databases, Sequences in a Tangible Medium, and Algorithms

Polynucleotide and polypeptide sequences form a valuable information resource with which
to determine their 2- and 3-dimensional structures as well as to identify further sequences of 
similar homology. These approaches are most easily facilitated by storing the sequence in a 
computer readable medium and then using the stored data in a known macromolecular 
structure program or to search a sequence database using well known searching tools, such 
as the GCG program package. 

Also provided by the invention are methods for the analysis of character sequences or 
strings, particularly genetic sequences or encoded protein sequences. Preferred methods 
of sequence analysis include, for example, methods of sequence homology analysis, such 
as identity and similarity analysis, DNA, RNA and protein structure analysis, sequence 
assembly, cladistic analysis, sequence motif analysis, open reading frame determination, 
nucleic acid base calling, codon usage analysis, nucleic acid base trimming, and 
sequencing chromatogram peak analysis. 
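
Two of the listed analyses, open reading frame determination and codon usage analysis, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the method of the invention: the function names are hypothetical, only the three forward-strand frames are scanned, ATG is the only start codon considered, and the toy sequence is invented for the example.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2):
    """Yield (start, end) index pairs of ATG...stop ORFs on the forward strand.

    end is exclusive and includes the stop codon. min_codons is the minimum
    number of codons before the stop codon, start codon included. Reverse-strand
    frames and alternative start codons are deliberately ignored in this sketch.
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3)
                start = None


def codon_usage(orf: str) -> Counter:
    """Count codons in an in-frame coding sequence."""
    orf = orf.upper()
    return Counter(orf[i:i + 3] for i in range(0, len(orf) - 2, 3))


# Toy sequence containing a single short ORF in the third frame.
toy = "CCATGAAAGGGTAACC"
orfs = list(find_orfs(toy))
print(orfs)
print(codon_usage(toy[orfs[0][0]:orfs[0][1]]))
```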



A computer based method is provided for performing homology identification. This 
method comprises the steps of: providing a first polynucleotide sequence comprising the 
sequence of a polynucleotide of the invention in a computer readable medium; and 
comparing said first polynucleotide sequence to at least one second polynucleotide or 
polypeptide sequence to identify homology.

A computer based method is also provided for performing homology identification, said 
method comprising the steps of: providing a first polypeptide sequence comprising the 
sequence of a polypeptide of the invention in a computer readable medium; and 
comparing said first polypeptide sequence to at least one second polynucleotide or
polypeptide sequence to identify homology. 
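
The two steps of the computer based method (providing a first sequence in a computer readable medium, then comparing it to at least one second sequence) might be sketched as follows. Everything named here is an assumption made for illustration: the FASTA records are invented, `io.StringIO` stands in for a file on disk, and the score from the standard-library `difflib.SequenceMatcher` is a crude similarity measure used as a stand-in, not an alignment-based percent identity.

```python
import io
from difflib import SequenceMatcher


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {identifier: sequence}."""
    records, name = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def similarity(a: str, b: str) -> float:
    """Crude similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def rank_by_similarity(first: str, others: dict) -> list:
    """Compare the first sequence to each second sequence, best match first."""
    scored = [(name, similarity(first, seq)) for name, seq in others.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical stored sequences; none of these is a sequence of the invention.
stored = io.StringIO(">query\nATGAAAGGGTAA\n>hit_a\nATGAAAGGCTAA\n>hit_b\nCCCCCCCCCCCC\n")
records = read_fasta(stored)
query = records.pop("query")
for name, score in rank_by_similarity(query, records):
    print(name, round(score, 2))
```

In practice the comparison step would be carried out with an alignment tool such as those named in the Definitions below; the ranking logic around it would look much the same.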



All publications and references, including but not limited to patents and patent
applications, cited in this specification are herein incorporated by reference in their 
entirety as if each individual publication or reference were specifically and individually 
indicated to be incorporated by reference herein as being fully set forth. Any patent 
application to which this application claims priority is also incorporated by reference 
herein in its entirety in the manner described above for publications and references. 



DEFINITIONS

"Identity," as known in the art, is a relationship between two or more polypeptide sequences 
or two or more polynucleotide sequences, as the case may be, as determined by comparing 
the sequences. In the art, "identity" also means the degree of sequence relatedness between 
polypeptide or polynucleotide sequences, as the case may be, as determined by the match 
between strings of such sequences. "Identity" can be readily calculated by known 
methods, including but not limited to those described in (Computational Molecular
Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing:
Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993;
Computer Analysis of Sequence Data, Part I, Griffin, A.M., and Griffin, H.G., eds.,
Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne,
G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J.,
eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J.
Applied Math., 48: 1073 (1988)). Methods to determine identity are designed to give the
largest match between the sequences tested. Moreover, methods to determine identity are
codified in publicly available computer programs. Computer program methods to
determine identity between two sequences include, but are not limited to, the GAP
program in the GCG program package (Devereux, J., et al., Nucleic Acids Research 12(1):
387 (1984)), BLASTP, BLASTN (Altschul, S.F. et al., J. Molec. Biol. 215: 403-410
(1990)), and FASTA (Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444-2448
(1988)). The BLAST family of programs is publicly available from NCBI and other
sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894;
Altschul, S., et al., J. Mol. Biol. 215: 403-410 (1990)). The well known Smith Waterman
algorithm may also be used to determine identity.

Parameters for polypeptide sequence comparison include the following: 
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff,
Proc. Natl. Acad. Sci. USA. 89:10915-10919 (1992)
Gap Penalty: 8
Gap Length Penalty: 2

A program useful with these parameters is publicly available as the "gap" program from
Genetics Computer Group, Madison WI. The aforementioned parameters are the default
parameters for peptide comparisons (along with no penalty for end gaps). 

Parameters for polynucleotide comparison include the following: 
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970)
Comparison matrix: matches = +10, mismatch = 0
Gap Penalty: 50 
Gap Length Penalty: 3 

Available as: The "gap" program from Genetics Computer Group, Madison WI. These 
are the default parameters for nucleic acid comparisons. 
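
To make the polynucleotide parameters concrete, the following sketch computes a Needleman-Wunsch style global alignment score with a separate gap penalty and gap length penalty (the affine-gap variant usually attributed to Gotoh). It is not the GCG "gap" program. Two assumptions are made and should be checked against the actual program before relying on the numbers: a gap of length k is charged gap penalty + k x gap length penalty, and end gaps are penalised like internal gaps. The function name is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Best global alignment score of a and b under an affine gap model.

    Assumed cost of a gap of length k: gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap; Y: b[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap costs gap_open + gap_extend; extending costs gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


# Toy sequences: seven matching positions and one mismatch, no gap needed.
print(global_alignment_score("ACGTACGT", "ACGTTCGT"))
```

With a match score of +10 and a gap costing at least 53, a gap is only opened when it recovers six or more matches, which is why these defaults favour ungapped alignments of closely related sequences.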

A preferred meaning for "identity" for polynucleotides and polypeptides, as the case may
be, is provided in (1) and (2) below.

(1) Polynucleotide embodiments further include an isolated polynucleotide
comprising a polynucleotide sequence having at least a 50, 60, 70, 80, 85, 90, 95, 97 or
100% identity to the reference sequence of SEQ ID NO:1 or 3, wherein said
polynucleotide sequence may be identical to the reference sequence of SEQ ID NO:1 or 3
or may include up to a certain integer number of nucleotide alterations as compared to the
reference sequence, wherein said alterations are selected from the group consisting of at
least one nucleotide deletion, substitution, including transition and transversion, or
insertion, and wherein said alterations may occur at the 5' or 3' terminal positions of the
reference nucleotide sequence or anywhere between those terminal positions, interspersed
either individually among the nucleotides in the reference sequence or in one or more
contiguous groups within the reference sequence, and wherein said number of nucleotide
alterations is determined by multiplying the total number of nucleotides in SEQ ID NO:1
or 3 by the integer defining the percent identity divided by 100 and then subtracting that
product from said total number of nucleotides in SEQ ID NO:1 or 3, or:

$$n_n \leq x_n - (x_n \cdot y),$$

wherein $n_n$ is the number of nucleotide alterations, $x_n$ is the total number of nucleotides
in SEQ ID NO:1 or 3, $y$ is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85
for 85%, 0.90 for 90%, 0.95 for 95%, 0.97 for 97% or 1.00 for 100%, and $\cdot$ is the symbol
for the multiplication operator, and wherein any non-integer product of $x_n$ and $y$ is
rounded down to the nearest integer prior to subtracting it from $x_n$. Alterations of a
polynucleotide sequence encoding the polypeptide of SEQ ID NO:2 or 4 may create 
nonsense, missense or frameshift mutations in this coding sequence and thereby alter the 
polypeptide encoded by the polynucleotide following such alterations. 

By way of example, a polynucleotide sequence of the present invention may be identical
to the reference sequence of SEQ ID NO:1 or 3, that is it may be 100% identical, or it
may include up to a certain integer number of nucleic acid alterations as compared to the
reference sequence such that the percent identity is less than 100% identity. Such
alterations are selected from the group consisting of at least one nucleic acid deletion,
substitution, including transition and transversion, or insertion, and wherein said
alterations may occur at the 5' or 3' terminal positions of the reference polynucleotide
sequence or anywhere between those terminal positions, interspersed either individually
among the nucleic acids in the reference sequence or in one or more contiguous groups
within the reference sequence. The number of nucleic acid alterations for a given percent
identity is determined by multiplying the total number of nucleic acids in SEQ ID NO:1
or 3 by the integer defining the percent identity divided by 100 and then subtracting that
product from said total number of nucleic acids in SEQ ID NO:1 or 3, or:

$$ n_n \le x_n - (x_n \bullet y), $$

wherein $n_n$ is the number of nucleic acid alterations, $x_n$ is the total number of nucleic
acids in SEQ ID NO:1 or 3, y is, for instance 0.70 for 70%, 0.80 for 80%, 0.85 for 85%
etc., • is the symbol for the multiplication operator, and wherein any non-integer product
of $x_n$ and y is rounded down to the nearest integer prior to subtracting it from $x_n$.
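The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the function name `max_alterations` is hypothetical, the percent identity is passed as an integer so that the rounding-down step can be done in exact integer arithmetic, and the sequence lengths used in the demonstration (2742 nucleotides, 913 amino acids) are illustrative values assumed for the example only.

```python
def max_alterations(total_residues: int, percent_identity: int) -> int:
    """Return the upper bound x - floor(x * y), where y = percent_identity / 100.

    total_residues   -- x, the number of nucleotides (or amino acids) in the reference
    percent_identity -- the integer percent identity, e.g. 85 for 85%
    """
    if total_residues < 0:
        raise ValueError("total_residues must be non-negative")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # floor(x * y) computed with integers, so a non-integer product is
    # rounded down before it is subtracted from x, as the text requires.
    retained = (total_residues * percent_identity) // 100
    return total_residues - retained


if __name__ == "__main__":
    # Illustrative reference length only (assumption for the example).
    for pct in (70, 85, 97, 100):
        print(pct, max_alterations(2742, pct))
```

The same function applies unchanged to the polypeptide case, with the number of amino acids in the reference sequence supplied as `total_residues`.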

(2) Polypeptide embodiments further include an isolated polypeptide comprising a
polypeptide having at least a 50, 60, 70, 80, 85, 90, 95, 97 or 100% identity to a
polypeptide reference sequence of SEQ ID NO:2 or 4, wherein said polypeptide sequence
may be identical to the reference sequence of SEQ ID NO:2 or 4 or may include up to a
certain integer number of amino acid alterations as compared to the reference sequence,
wherein said alterations are selected from the group consisting of at least one amino acid
deletion, substitution, including conservative and non-conservative substitution, or
insertion, and wherein said alterations may occur at the amino- or carboxy-terminal
positions of the reference polypeptide sequence or anywhere between those terminal
positions, interspersed either individually among the amino acids in the reference
sequence or in one or more contiguous groups within the reference sequence, and wherein
said number of amino acid alterations is determined by multiplying the total number of
amino acids in SEQ ID NO:2 or 4 by the integer defining the percent identity divided by
100 and then subtracting that product from said total number of amino acids in SEQ ID
NO:2 or 4, or:

$$ n_a \le x_a - (x_a \bullet y), $$

wherein $n_a$ is the number of amino acid alterations, $x_a$ is the total number of amino acids
in SEQ ID NO:2 or 4, y is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85
for 85%, 0.90 for 90%, 0.95 for 95%, 0.97 for 97% or 1.00 for 100%, and • is the symbol
for the multiplication operator, and wherein any non-integer product of $x_a$ and y is
rounded down to the nearest integer prior to subtracting it from $x_a$.

By way of example, a polypeptide sequence of the present invention may be identical to
the reference sequence of SEQ ID NO:2 or 4, that is it may be 100% identical, or it may
include up to a certain integer number of amino acid alterations as compared to the
reference sequence such that the percent identity is less than 100% identity. Such
alterations are selected from the group consisting of at least one amino acid deletion,
substitution, including conservative and non-conservative substitution, or insertion, and
wherein said alterations may occur at the amino- or carboxy-terminal positions of the
reference polypeptide sequence or anywhere between those terminal positions,
interspersed either individually among the amino acids in the reference sequence or in one
or more contiguous groups within the reference sequence. The number of amino acid
alterations for a given % identity is determined by multiplying the total number of amino
acids in SEQ ID NO:2 or 4 by the integer defining the percent identity divided by 100 and
then subtracting that product from said total number of amino acids in SEQ ID NO:2 or 4,
or:

$$ n_a \le x_a - (x_a \bullet y), $$

wherein $n_a$ is the number of amino acid alterations, $x_a$ is the total number of amino acids
in SEQ ID NO:2 or 4, y is, for instance 0.70 for 70%, 0.80 for 80%, 0.85 for 85% etc.,
and • is the symbol for the multiplication operator, and wherein any non-integer product
of $x_a$ and y is rounded down to the nearest integer prior to subtracting it from $x_a$.

"Individual(s)," when used herein with reference to an organism, means a multicellular
eukaryote, including, but not limited to a metazoan, a mammal, an ovid, a bovid, a simian,
a primate, and a human.

"Isolated" means altered "by the hand of man" from its natural state, i.e., if it occurs in
nature, it has been changed or removed from its original environment, or both. For example,
a polynucleotide or a polypeptide naturally present in a living organism is not "isolated," but
the same polynucleotide or polypeptide separated from the coexisting materials of its natural
state is "isolated", as the term is employed herein. Moreover, a polynucleotide or
polypeptide that is introduced into an organism by transformation, genetic manipulation or
by any other recombinant method is "isolated" even if it is still present in said organism,
which organism may be living or non-living. Similarly, a polynucleotide or polypeptide
whose expression is specifically altered by genetic manipulation is "isolated" even though
the polynucleotide or polypeptide may be present in the organism in which it is naturally
present.

"Polynucleotide(s)" generally refers to any polyribonucleotide or polydeoxyribonucleotide,
which may be unmodified RNA or DNA or modified RNA or DNA including single and
double-stranded regions.

"Variant" refers to a polynucleotide or polypeptide that differs from a reference
polynucleotide or polypeptide, but retains essential properties. A typical variant of a
polynucleotide differs in nucleotide sequence from another, reference polynucleotide.
Changes in the nucleotide sequence of the variant may or may not alter the amino acid
sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide changes
may result in amino acid substitutions, additions, deletions, fusions and truncations in
the polypeptide encoded by the reference sequence, as discussed below. A typical
variant of a polypeptide differs in amino acid sequence from another, reference
polypeptide. Generally, differences are limited so that the sequences of the reference
polypeptide and the variant are closely similar overall and, in many regions, identical.
A variant and reference polypeptide may differ in amino acid sequence by one or more
substitutions, additions or deletions in any combination. A substituted or inserted amino
acid residue may or may not be one encoded by the genetic code. A variant of a
polynucleotide or polypeptide may be naturally occurring, such as an allelic variant, or
it may be a variant that is not known to occur naturally. Non-naturally occurring

variants of polynucleotides and polypeptides may be made by mutagenesis techniques
or by direct synthesis.

"Disease(s)" means any disease caused by or related to infection by bacteria, including,
for example, otitis media, acute otitis media, recurrent otitis media, otitis media with
effusion, sinusitis, conjunctivitis, rhinopharyngitis, laryngitis, obstructive laryngitis,
alveolitis, bronchitis, chronic bronchitis, enhancement of chronic obstructive pulmonary
disease, complications of cystic fibrosis, pericarditis, endocarditis, osteomyelitis,
arthritis, genitourinary tract colonization and neonatal infection, bacteremia, septicemia
and meningitis.



EXAMPLES: 

The examples below are carried out using standard techniques, which are well known and 
routine to those of skill in the art, except where otherwise described in detail. The examples 
are illustrative, but do not limit the invention. 

Example 1: BASB070 gene from Haemophilus influenzae strain Rd KW20 and 
non-typeable Haemophilus influenzae (NTHi) strain 3224. 

A: BASB070 in Hi strain Rd. 

The BASB070 gene of SEQ ID NO:1 comes from Haemophilus influenzae strain Rd
KW20. The translation of the BASB070 polynucleotide sequence is shown in SEQ ID
NO:2.
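The relationship between a polynucleotide and its deduced polypeptide can be illustrated with a minimal translation sketch. It assumes the standard genetic code and a reading frame starting at the first nucleotide; the function name `translate` and the way the codon table is built are illustrative choices and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Unknown codons (for example, ones containing N) become "X".
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 nucleotides of SEQ ID NO:1 as listed under SEQUENCE INFORMATION.
    print(translate("ATGAAGAAAGCTATAAAATTAAATTTAATT"))
```

Applied to the first 30 nucleotides of SEQ ID NO:1, the sketch gives MKKAIKLNLI, which matches the start of SEQ ID NO:2.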

B: BASB070 in NTHi strain 3224. 

The sequence of the BASB070 gene comes from the sequencing of NTHi strain 3224. 

Using the MegAlign program from the DNASTAR software package, an alignment of
the polynucleotide sequences of SEQ ID NO:1 and 3 was performed, and is displayed in
Figure 1; a pairwise comparison of identities shows that the two BASB070
polynucleotide gene sequences are 90.3 % identical. Using the same MegAlign
program, an alignment of the polypeptide sequences of SEQ ID NO:2 and 4 was
performed, and is displayed in Figure 2; a pairwise comparison of identities shows that
the two BASB070 protein sequences are 90.6 % identical. These data show that the
BASB070 gene is conserved between the two strains NTHi 3224 and Hi Rd but that
there are also variable regions between them.
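A pairwise identity figure of this kind can be illustrated with a short sketch. The exact calculation used by MegAlign is not described in the text, so the definition below (identical columns divided by all alignment columns, with gap columns counted as mismatches) is an assumption made for illustration, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment, not taken from Figure 1.
    print(round(percent_identity("ATG-AG", "ATGCAA"), 1))
```

Other conventions, such as dividing by the length of the shorter sequence or excluding gap columns, give slightly different percentages, so a value computed this way need not reproduce the reported 90.3 % or 90.6 % exactly.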

Example 2: Construction of Plasmid to Express Recombinant BASB070 

A: Cloning of BASB070 . 

The NcoI and Asp718 restriction sites CCATGG and GGTACC were engineered
into specifically designed forward and reverse amplification primers, respectively,
permitting directional cloning of a BASB070 PCR product into the commercially
available E. coli expression plasmid pBADgIII(A) (Invitrogen, USA, ampicillin
resistant). This plasmid provides the signal peptide from the bacteriophage fd pIII
protein such that a mature BASB070 protein can be targeted to the periplasm of E. coli.
The BASB070 PCR product was purified from the amplification reaction using Wizard
PCR prep™ (Promega) according to the manufacturer's instructions. To produce the
required NcoI and Asp718 termini necessary for cloning, purified PCR product was
sequentially digested to completion with NcoI and Asp718 restriction enzymes as
recommended by the manufacturer (Boehringer Mannheim). Digested BASB070 PCR
products and pBAD were gel-purified and ligated together using an approximately
5-fold molar excess of the digested fragment to the vector. A standard ~20 µl ligation
reaction (~16°C, ~16 hours), using methods well known in the art, was performed using
T4 DNA ligase (~2.0 units / reaction, Boehringer Mannheim). An aliquot of the
ligation was used to transform electro-competent E. coli Top10 cells according to
methods well known in the art. Following a ~2-3 hour outgrowth period at 37°C in
~1.0 ml of LB broth, transformed cells were plated on LB agar plates containing
ampicillin (50 µg/ml). Individual ampicillin-resistant colonies were selected and
analyzed by whole cell-based PCR to verify that transformants contained the BASB070
DNA insert. Transformants that produced the expected PCR product were identified as
strains containing a BASB070 expression construct. Expression plasmid-containing
strains were then analyzed for the inducible expression of recombinant BASB070.
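The 5-fold molar excess of insert over vector can be turned into a mass using the usual proportionality between mass and fragment length for double-stranded DNA. The sketch below is illustrative only: the function name `insert_mass_ng` is hypothetical, and the vector amount and fragment sizes in the demonstration are placeholder values, not figures from the example.

```python
def insert_mass_ng(
    vector_ng: float,
    vector_bp: int,
    insert_bp: int,
    molar_ratio: float = 5.0,
) -> float:
    """Mass of insert (ng) giving `molar_ratio` moles of insert per mole of vector.

    For double-stranded DNA the molar amount is proportional to mass / length,
    so insert_ng = vector_ng * (insert_bp / vector_bp) * molar_ratio.
    """
    if vector_ng <= 0 or vector_bp <= 0 or insert_bp <= 0 or molar_ratio <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # Placeholder values: 100 ng of a 4000 bp vector and a 2000 bp insert.
    print(insert_mass_ng(100.0, 4000, 2000, molar_ratio=5.0))
```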

B: Expression Analysis of PCR-Positive Transformants. 
For each PCR-positive transformant identified above, ~5.0 ml of LB broth containing
ampicillin (50 µg/ml) was inoculated with cells from the patch plate and grown
overnight at 37 °C with shaking (~250 rpm). An aliquot of the overnight seed culture
(~1.0 ml) was inoculated into a 125 ml Erlenmeyer flask containing ~25 ml of LB
ampicillin broth and grown at 37 °C with shaking (~250 rpm) until the culture
turbidity reached an O.D.600 of ~0.5, i.e. mid-log phase (usually about 1.5 - 2.0 hours). At
this time approximately half of the culture (~12.5 ml) was transferred to a second 125
ml flask and expression of recombinant BASB070 protein induced by the addition of
L-arabinose to a final concentration of 0.2 % (w/v). Incubation of both the
arabinose-induced and non-induced cultures continued for an additional ~4 hours at 37 °C with
shaking. Samples (~1.0 ml) of both induced and non-induced cultures were removed
after the induction period and the cells collected by centrifugation in a microcentrifuge
at room temperature for ~3 minutes. Individual cell pellets were suspended in ~50 µl of
sterile water, then mixed with an equal volume of 2X Laemmli SDS-PAGE sample
buffer containing 2-mercaptoethanol, and placed in a boiling water bath for ~3 min to
denature protein. Equal volumes (~15 µl) of both the crude arabinose-induced and the
non-induced cell lysates were loaded onto duplicate 12% Tris/glycine polyacrylamide
gels (1 mm thick Mini-gels, Novex). The induced and non-induced lysate samples were
electrophoresed together with prestained molecular weight markers under conventional
conditions using a standard SDS/Tris/glycine running buffer. Following
electrophoresis, one gel was stained with Coomassie brilliant blue R250 (BioRad) and
then destained to visualize novel BASB070 arabinose-inducible protein(s).
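The induction step implies a simple dilution calculation: how much arabinose stock to add to reach 0.2 % (w/v) in the culture. The sketch below is a hedged illustration; the function name `stock_volume_ml` is hypothetical and the 20 % (w/v) stock concentration is an assumed value, since the text does not state the stock used.

```python
def stock_volume_ml(culture_ml: float, final_pct: float, stock_pct: float) -> float:
    """Volume of stock (ml) to add so the mixture reaches `final_pct` (w/v).

    Solves stock_pct * v / (culture_ml + v) = final_pct for v, which accounts
    for the volume added by the stock itself.
    """
    if culture_ml <= 0 or final_pct <= 0:
        raise ValueError("culture volume and final concentration must be positive")
    if stock_pct <= final_pct:
        raise ValueError("stock must be more concentrated than the final target")
    return culture_ml * final_pct / (stock_pct - final_pct)


if __name__ == "__main__":
    # ~12.5 ml culture, 0.2 % final, assumed 20 % (w/v) L-arabinose stock.
    print(round(stock_volume_ml(12.5, 0.2, 20.0), 3))
```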
NTHi Strains

The following strains of Haemophilus influenzae are provided as a useful reference for the
present invention. The BASB070 gene utilised in accordance with the invention is not limited
with regard to the strain, but it may correspond to the BASB070 gene as found in any of the strains
listed below or any related strain. This information is provided merely for convenience to those of
skill in the art and is not an admission that any provision of a deposit is required for enablement.

strain 3219C (ET7)
strain 3241A (ET30)
strain 840645 (ET51)
strain 901905U (ET60)
strain A840177 (ET40)
strain A840177 (ET69)

All of the above strains were described in van Alphen, L., Caugant, D.A., Duim, B.A., O'Rourke, M.,
Bowler, L.D. (1997) Differences in genetic diversity of non-encapsulated H. influenzae from
various diseases. Microbiology, 143: 1423-1431.

HiRd Strains

An example of a HiRd strain is described in R.D. Fleischmann et al., Science, Vol 269: 496-512
(1995) and K. W. Wilcox et al., J. Bact. Vol 122: 443 (1975) with the strain name KW20. This
strain was deposited by the authors with the American Type Culture Collection under deposit
number ATCC 51907.

SEQUENCE INFORMATION

BASB070 Polynucleotide and Polypeptide Sequences
SEQ ID NO:1

Haemophilus influenzae BASB070 polynucleotide sequence from strain Rd KW20

ATGAAGAAAGCTATAAAATTAAATTTAATTACACTTGGCCTAATTAATACGATCGGTATGACGATTACACAAGCTCAAGC 
CGAAGAAACATTAGGACAAATTGATGTAGTGGAAAAAGTTATATCAAACGATAAAAAACCTTTCACTGAAGCCAAAGCCA 
AAAGTACACGTGAAAATGTCTTTAAGGAAACACAAACCATTGACCAAGTGATTCGAAGTATCCCTGGTGCATTTACTCAA 
CAAGATAAAGGCTCGGGTGTCGTTTCTGTGAATATTCGTGGCGAAAATGGATTAGGTCGTGTCAATACTATGGTTGATGG 

TGTAACACAGACCTTCTATTCTACAGCCTTAGACTCAGGTCAATCAGGCGGAAGTTCTCAATTTGGTGCGGCAATCGATC
CTAATTTTATTGCAGGTGTAGATGTTAATAAAAGCAACTTTTCAGGAGCAAGCGGTATAAATGCGTTAGCAGGCAGTGCT 
AATTTTAGAACATTAGGCGTTAATGATGTTATTACCGATGACAAACCATTTGGCATTATTCTGAAAGGAATGACAGGGAG 
TAATGCCACTAAATCCAATTTTATGACAATGGCTGCTGGCAGAAAATGGCTTGATAATGGTGGCTATGTAGGCGTGGTGT 
ATGGTTATAGCCAACGTGAAGTATCTCAAGATTACCGTATCGGTGGCGGAGAACGATTAGCATCATTAGGGCAGGATATT 

CTCGCGAAAGAAAAAGAAGCTTATTTTCGTAATGCGGGTTATATTTTAAATCCTGAAGGGCAATGGACACCTGATTTAAG
CAAAAAACATTGGTCTTGTAACAAACCAGATTATCAGAAAAATGGTGATTGTAGTTATTATCGTATTGGATCTGCTGCAA 
AGACTAGAAGAGAAATTCTACAAGAATTATTAACAAATGGAAAAAAACCTAAGGATATTGAAAAGCTCCAAAAAGGTAAT 
GATGGAATTGAAGAAACTGACAAATCATTTGAACGTAATAAAGATCAATATAGTGTTGCACCGATTGAGCCGGGTAGTTT 
GCAATCTCGTTCTCGTAGCCATTTATTAAAATTTGAATATGGCGATGATCACCAAAATTTAGGGGCGCAATTACGCACGT 

TGGATAATAAAATTGGTTCTCGCAAAATTGAAAACCGTAATTACCAAGTCAATTATAACTTCAATAATAACAGCTATCTT 
GATCTTAATTTAATGGCTGCACATAACATTGGAAAAACTATTTATCCTAAAGGCGGTTTTTTTGCTGGCTGGCAAGTGGC 
AGATAAACTTATCACTAAAAATGTCGCAAATATTGTTGATATAAACAACAGCCATACTTTCTTACTGCCAAAAGAAATTG 
ATTTAAAAACCACATTAGGTTTTAACTATTTTACCAATGAATACAGTAAAAACCGTTTTCCAGAAGAATTAAGTTTGTTT
TATAACGATGCTTCACATGATCAAGGCTTATATTCACACAGTAAAAGAGGGCGATATTCTGGCACAAAAAGTTTATTACC 

ACAACGTTCAGTAATCTTACAACCTTCTGGCAAGCAAAAATTTAAAACCGTGTATTTTGATACCGCACTTTCTAAAGGCA 
TTTATCATTTAAATTACAGCGTGAATTTTACCCATTATGCCTTTAATGGTGAGTATGTAGGTTACGAAAATACAGCGGGT 
CAACAAATTAATGAACCTATTTTGCATAAATCAGGGCATAAAAAGGCATTCAATCATTCTGCCACATTAAGTGCAGAACT
GAGTGATTATTTTATGCCATTTTTTACTTATTCACGCACTCACAGAATGCCGAATATTCAAGAGATGTTTTTCTCTCAAG 
TGTCTAATGCAGGGGTAAACACAGCATTAAAACCTGAACAATCTGACACCTATCAACTAGGCTTTAATACTTATAAAAAA 

GGTCTCTTCACTCAAGACGATGTGCTAGGCGTAAAATTAGTAGGCTATCGTAGCTTTATTAAAAACTATATCCATAATGT
TTATGGTGTTTGGTGGCGAGATGGCATGCCTACGTGGGCAGAAAGTAATGGATTTAAATATACTATTGCTCATCAAAATT 
ATAAGCCTATTGTGAAAAAGAGCGGCGTCGAGTTAGAAATTAACTATGACATGGGACGTTTTTTTGCGAATGTCTCTTAT 
GCATATCAACGAACAAATCAACCAACCAATTATGCCGATGCCAGCCCGCGTCCGAATAATGCTTCACAAGAAGACATTTT 
GAAACAAGGTTATGGCTTATCTCGTGTTTCAATGCTACCAAAAGACTACGGCAGATTAGAGCTTGGCACACGTTGGTTTG 

ATCAAAAATTAACCTTAGGTCTGGCAGCTCGTTATTATGGAAAAAGTAAACGTGCGACAATTGAAGAAGAATATATCAAT
GGATCTCGCTTTAAAAAAAATACCTTGCGTCGTGAAAATTACTATGCCGTGAAAAAAACGGAAGATATTAAAAAACAACC 
GATTATTTTAGATTTACACGTCAGCTATGAACCAATCAAAGATTTGATTATTAAAGCGGAAGTACAAAATCTATTAGATA 
AACGTTATGTTGATCCGTTAGATGCTGGAAATGACGCGGCTTCGCAACGTTATTATTCAAGTTTAAATAATTCTATAGAA 
TGTGCGCAAGATTCTTCTGCTTGCGGTGGTTCAGATAAAACCGTGCTTTATAACTTTGCACGTGGAAGAACTTATATTCT 

GAGTTTAAACTATAAATTCTAA



SEQ ID NO:2 

Haemophilus influenzae BASB070 polypeptide sequence deduced from the 
polynucleotide of SEQ ID NO:1

MKKAIKLNLITLGLINTIGMTITQAQAEETLGQIDVVEKVISNDKKPFTEAKAKSTRENVFKETQTIDQVIRSIPGAFTQ
QDKGSGVVSVNIRGENGLGRVNTMVDGVTQTFYSTALDSGQSGGSSQFGAAIDPNFIAGVDVNKSNFSGASGINALAGSA 
NFRTLGVNDVITDDKPFGIILKGMTGSNATKSNFMTMAAGRKWLDNGGYVGVVYGYSQREVSQDYRIGGGERLASLGQDI 
LAKEKEAYFRNAGYILNPEGQWTPDLSKKHWSCNKPDYQKNGDCSYYRIGSAAKTRREILQELLTNGKKPKDIEKLQKGN 
DGIEETDKSFERNKDQYSVAPIEPGSLQSRSRSHLLKFEYGDDHQNLGAQLRTLDNKIGSRKIENRNYQVNYNFNNNSYL 
DLNLMAAHNIGKTIYPKGGFFAGWQVADKLITKNVANIVDINNSHTFLLPKEIDLKTTLGFNYFTNEYSKNRFPEELSLF 
YNDASHDQGLYSHSKRGRYSGTKSLLPQRSVILQPSGKQKFKTVYFDTALSKGIYHLNYSVNFTHYAFNGEYVGYENTAG 
QQINEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMPNIQEMFFSQVSNAGVNTALKPEQSDTYQLGFNTYKK 
GLFTQDDVLGVKLVGYRSFIKNYIHNVYGVWWRDGMPTWAESNGFKYTIAHQNYKPIVKKSGVELEINYDMGRFFANVSY 
AYQRTNQPTNYADASPRPNNASQEDILKQGYGLSRVSMLPKDYGRLELGTRWFDQKLTLGLAARYYGKSKRATIEEEYIN 
GSRFKKNTLRRENYYAVKKTEDIKKQPIILDLHVSYEPIKDLIIKAEVQNLLDKRYVDPLDAGNDAASQRYYSSLNNSIE 
CAQDSSACGGSDKTVLYNFARGRTYILSLNYKF 

SEQ ID NO:3 

Haemophilus influenzae BASB070 polynucleotide sequence from strain NTHi 3224

ATGAAGAAAGCTATAAAATTAAATTTAATTACACTTAGCCTAATCAATACAATCGGTATGACGATTACACAAGCTCAAGC 
CGAAGAAACATTAGGGCAAATTGATGTCGTAGAAAAAGTGATATCAAATGACAAAAAACCTTTCACTGAAGCCAAAGCCA 
AAAGTACGCGTGAAAATGTCTTTAAGGAAACACAAACCATTGACCAAGTCATTCGGAGCATTCCTGGGGCATTTACTCAA 
CAAGATAAAGGCTCGGGTGTGGTTTCTGTAAATATTCGTGGCGAAAATGGATTAGGTCGTGTCAATACGATGGTTGATGG 

TGTAACCCAAACCTTCTATTCTACAGCCTTAGACTCTGGTCAATCAGGCGGAAGTTCTCAATTTGGTGCGGCAATCGACC
CTAATTTTATTGCAGGTGTAGATGTTAATAAAAGCAACTTTTCGGGAGCAAGCGGTATAAATGCCTTAGCAGGCAGTGCT 
AATTTTAGAACATTAAGCGTTAATGATGTGATTACCGATGACAAACCATTCGGCATTATTCTGAAAGGAATGACAGGGAG 
CAATGCCACTAAATCCAATTTTATGACGACAGCTGCAGGCAGAAAATGGCTTGATAATGGTGGCTATGTAGGCGTAGTGT 
ATGGTTATAGCCAACGTGAAGTTTCACAAGATTATCGTATAGGTGGCGGAGAACGATTAGCATCATTAGGGCAAGATATT 

CTTGCTAAAGAAAAAGAAAAGATTTTTCGTAATGATGGTTATGTTTTAAATTCTGCTGGACAATGGGCACCTGATTTAAA
CAAACCACATTGGTCTTGTAATACCCCGAGTTCTTTAAAAGATAAAAGTATGAGTACATCTTGTAAGCCTTATCGTCTTG 
GACCTGCTGCAACGACTAGACAAGAAATTCTAAAAGAATTATTAGAAGATGGAAAAGAACCTAAGGATATTGAAAAGCTC 
CAAAAAAGTAATGATGGAATTGAAGAAACTGAAAAATCATTTGAACGTAATAAAGATCAATATGACGTCGCCCCTATTGA 
GCCTGGTAGTTTGCAATCTCGTTCACGTAGTCATTTATTAAAATTTGAATATAGCGATGATCACCATACGCTAGGGGCGC 

AAATACGTACCCTTGATAATAAAATTGGTTCTCGCAAAATTGAAAACCGTAATTACCAAGTCAATTATAACTTCAATAAT
AACAGCTATCTTGATCTTAATTTAATGGCTGCACATAACATTGGCAAAACTATTTATCCTAAGGGTGGTTTTTTTGCTGG 
CTGGCAAGTGGCAGACAAACTTATCACAAAAAATGTGGCAAATATTGTTGATATAAATAACAGCCATACTTTCTTACTGC 
CAAAAGAAATCGATTTAAAAACCACATTAGGGTTTAACTATTTTACCAATGAATACAGTAAAAACCGTTTTCCAGAAGAA 
TTAAGTTTGTTTTATGTGAATGAATCACATGATCAAGGCTTATATTCACTCAGTAATAAAGGGCGATATTCTGGCTCAAA 

AGGTTTATTACCACAACGTTCAGTAATCTTACAACCTTCTGGCAAGCAAAAATTTAAAACAGTGTATTTTGATACCGCAC
TTTCTAAAGGTATTTATCATTTAAATTACAGCGTGAATTTTACCCATTATGCCTTTAATGGTGAGTATGTTGGATATAAA 
AATACAGCAGATAAAATTAATGAACCTATTTTGCATAAATCAGGGCATAAAAAGGCATTCAATCATTCTGCTACATTAAG 
TGCAGAGCTAAGTGATTATTTTATGCCATTTTTTACTTATTCACGCACACACAGAATGCCGAATATTCAAGAGATGTTTT 

TCTCTCAAGTGTCTGATGCTGGGGTAAACACCGCATTAAAACCTGAACAATCTGACACCTATCAACTAGGCTTTAATACT
TATAAAAAAGGTCTATTCACTCAAGACGATGTATTAGGCATCAAATTAGTGGGCTATCGTAGCTTTATTAAAAACTATAT 
CCACAATGTGTATGGAGATTGGTCACGAGATGGTGTTATGCCAGAGTGGGCAAGACTCAATGGTTTTCGTCTGACGATTG 
CTCATCAAAATTATCAACCAATAGTGAAAAAAAGCGGAGCTGAGTTAGAGCTCAATTATGATATGGGGCGTTTTTTTGCA 
AATCTGTCTTATGCTTATCAACGTACTAATCAGCCAACCAATTATGCCGATGCCAGCTCACGTCCGCGTAATGCTTCAAA
AGAAGAGATTTTGAAACAAGGTTATGGTTTATCACGAATCTCTATGTTACCAAAGGACTACGGTAGATTAGAGCTTGGCA 
CACGCTGGTTTGATCAAAAATTAACTCTTGGTATCGCAGCCCGTTACTATGGAAAAAGTAAACGTGCTACAACTCAAGAA 
GAATACATCAACGGCTCTCGCTATGAAAAAAATACTACGCGCGACAGAATTTATTATGCTATTAAAAAGACAGAAGAGAT 
TAAAAAACAACCTATTATTTTAGATTTACACGTCAGCTATGAACCAATCAAAGATTTGATTATTAAAGCGGAAGTACAAA
ATCTATTAGATAAACGTTATGTTGATCCGTTAGATGCTGGAAATGATGCGGCTTCGCAACGTTATTATTCAAGTTTAAAT
GATTCTTTAGCCTGTAAAATAAATGAATCAACCTGTAATGATGGTTCAGAGAAAACTGTGCTTTATAACTTTGCACGTGG 
AAGAACTTATATTCTGAGTTTGAACTATAAATTCTAG 



SEQ ID NO:4 

Haemophilus influenzae BASB070 polypeptide sequence deduced from the
polynucleotide of SEQ ID NO:3

MKKAIKLNLITLSLINTIGMTITQAQAEETLGQIDVVEKVISNDKKPFTEAKAKSTRENVFKETQTIDQVIRSIPGAFTQ
QDKGSGVVSVNIRGENGLGRVNTMVDGVTQTFYSTALDSGQSGGSSQFGAAIDPNFIAGVDVNKSNFSGASGINALAGSA
NFRTLSVNDVITDDKPFGIILKGMTGSNATKSNFMTTAAGRKWLDNGGYVGVVYGYSQREVSQDYRIGGGERLASLGQDI 

LAKEKEKIFRNDGYVLNSAGQWAPDLNKPHWSCNTPSSLKDKSMSTSCKPYRLGPAATTRQEILKELLEDGKEPKDIEKL
QKSNDGIEETEKSFERNKDQYDVAPIEPGSLQSRSRSHLLKFEYSDDHHTLGAQIRTLDNKIGSRKIENRNYQVNYNFNN 
NSYLDLNLMAAHNIGKTIYPKGGFFAGWQVADKLITKNVANIVDINNSHTFLLPKEIDLKTTLGFNYFTNEYSKNRFPEE 
LSLFYVNESHDQGLYSLSNKGRYSGSKGLLPQRSVILQPSGKQKFKTVYFDTALSKGIYHLNYSVNFTHYAFNGEYVGYK 
NTADKINEPILHKSGHKKAFNHSATLSAELSDYFMPFFTYSRTHRMPNIQEMFFSQVSDAGVNTALKPEQSDTYQLGFNT 

YKKGLFTQDDVLGIKLVGYRSFIKNYIHNVYGDWSRDGVMPEWARLNGFRLTIAHQNYQPIVKKSGAELELNYDMGRFFA
NLSYAYQRTNQPTNYADASSRPRNASKEEILKQGYGLSRISMLPKDYGRLELGTRWFDQKLTLGIAARYYGKSKRATTQE 
EYINGSRYEKNTTRDRIYYAIKKTEEIKKQPIILDLHVSYEPIKDLIIKAEVQNLLDKRYVDPLDAGNDAASQRYYSSLN 
DSLACKINESTCNDGSEKTVLYNFARGRTYILSLNYKF 
CLAIMS

1. A vaccine composition comprising an effective amount of a polypeptide which
polypeptide comprises an amino acid sequence which has at least 85% identity to the 
amino acid sequence of SEQ ID NO: 2 or 4 or to an immunogenic fragment thereof, or 
which polypeptide comprises a mimotope of the said amino acid sequence or 
immunogenic fragment, together with a pharmaceutically acceptable carrier. 

2. A vaccine composition according to claim 1 wherein the amino acid sequence 
has at least 95% identity to the amino acid sequence of SEQ ID NO: 2 or 4 or to an 
immunogenic fragment thereof. 

3. A vaccine composition comprising an effective amount of a polynucleotide which 
polynucleotide comprises a nucleotide sequence which has at least 85% identity to the 
nucleotide sequence of SEQ ID NO: 1 or 3 or to a fragment thereof which encodes an
immunogenic polypeptide, together with a pharmaceutically acceptable carrier. 

4. The vaccine composition according to any one of claims 1 to 3 wherein said 
composition comprises at least one other Haemophilus influenzae antigen. 

5. An expression vector or a recombinant live microorganism comprising an isolated 
polynucleotide which polynucleotide comprises a nucleotide sequence which has at least 
85% identity to the nucleotide sequence of SEQ ID NO: 1 or 3 or to a fragment thereof
that encodes an immunogenic polypeptide. 

6. A host cell comprising the expression vector of claim 5 or a membrane of said 
host cell expressing an isolated polypeptide comprising an amino acid sequence which 
has at least 85% identity to the amino acid sequence of SEQ ID NO: 2 or 4, or to an
immunogenic fragment thereof. 

7. A process for producing a polypeptide comprising an amino acid sequence that has 
at least 85% identity to the amino acid sequence of SEQ ID NO: 2 or 4 or to an
immunogenic fragment, comprising culturing a host cell of claim 6 under conditions
sufficient for the production of said polypeptide and recovering the polypeptide from the 
culture medium. 

8. A process for expressing a polynucleotide, which polynucleotide comprises a
nucleotide sequence which has at least 85% identity to the nucleotide sequence of SEQ
ID NO: 1 or 3 or to a fragment thereof that encodes an immunogenic polypeptide, the
process comprising transforming a host cell with an expression vector comprising said
polynucleotide and culturing said host cell under conditions sufficient for expression of
said polynucleotide.

9. An antibody specific for the polypeptide of SEQ ID NO: 2 or 4 or an 
immunologically active fragment of the antibody. 

10. A method of diagnosing a Haemophilus influenzae infection, comprising identifying
a polypeptide which comprises an amino acid sequence which has at least 85% identity to
the amino acid sequence of SEQ ID NO: 2 or 4 or a fragment thereof, or an antibody that
is specific for said polypeptide, present within a biological sample from an animal 
suspected of having such an infection. 

11. Use of a composition comprising an immunologically effective amount of a
polypeptide which comprises an amino acid sequence which has at least 85% identity to
the amino acid sequence of SEQ ID NO: 2 or 4 or to an immunogenic fragment thereof, or
which polypeptide comprises a mimotope of the said amino acid sequence or
immunogenic fragment, in the preparation of a medicament for use in generating an 
immune response in a mammal. 

12. Use of a composition comprising an immunologically effective amount of a 
polynucleotide which comprises a nucleotide sequence which has at least 85% identity to 
the nucleotide sequence of SEQ ID NO: 1 or 3 or to a fragment thereof that encodes an 
immunogenic polypeptide, in the preparation of a medicament for use in generating an 
immune response in a mammal. 

13. A therapeutic composition useful in treating humans with Haemophilus influenzae 
disease comprising at least one antibody directed against the polypeptide of SEQ ID NO: 
2 or 4 and a suitable pharmaceutical carrier. 

14. An isolated polypeptide comprising the amino acid sequence of SEQ ID NO: 4 or 
a fragment or a mimotope thereof. 

15. An isolated polynucleotide comprising a nucleotide sequence encoding the 
polypeptide of claim 14. 

16. The polynucleotide of claim 15 comprising the nucleotide sequence of SEQ ID 
NO: 3 or a fragment thereof.
Figure 1: Alignment of the BASB070 polynucleotide sequences.

Identity to SEQ ID NO:1 is indicated by a dot and a gap is indicated by a dash.



5 * 20 * 

Seqidl : ATGAAGAAAGCTATAAAATTAAATTTAATT : 3 0 

Seqid3 : . 30 

10 40 * 60 

Seqidl : ACACTTGGCCTAATTAATACGATCGGTATG : 60 

Seqid3 : A C A. : 60 

15 * 80 * 

Seqidl : ACGATTACACAAGCT CAAGC CGAAGAAACA : 90 

Seqid3 : . 90 

20 100 * 120 

Seqidl : TTAGGACAAATTGATGTAGTGGAAAAAGTT : 120 

Seqid3 : G c. .A G : 120 

25 * 14 0 * 

Seqidl : ATATCAAACGATAAAAAACCTTTCACTGAA : 150 

Seqid3 : T. .C : 150 

30 160 * 180 

Seqidl : GCCAAAGCCAAAAGTACACGTGAAAATGTC : 18 0 

Seqid3 : G : 180 
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10 



15 



20 



25 



Seqidl 
Seqid3 



240 
240 



* 200 * 
Seqidl : TTTAAGGAAACACAAACCATTGACCAAGTG : 210 
Seqid3 : c . 2 10 

220 * 240 
Seqidl : ATTCGAAGTATCCCTGGTGCATTTACTCAA : 
Seqid3 : G..C..T G : 

* 260 * 
Seqidl : CAAGATAAAGGCTCGGGTGTCGTTTCTGTG : 2 70 
Seqid3 : G A : 270 

2 80 * 3 00 

AATATT CGTGGCGAAAATGGATTAGGT CGT : 

* 320 * 
Seqidl : GTCAATACTATGGTTGATGGTGTAACACAG : 33 0 
Seqid3 : G C. .A : 330 

340 * 360 
Seqidl : ACCTTCTATTCTACAGCCTTAGACTCAGGT : 3 60 
Seqid3 : T... : 360 



300 
300 



Seqidl 
Seqid3 



* 380 * 

CAAT CAGGCGGAAGTTCT CAATTTGGTG CG 



390 
390 
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10 



15 



20 



25 



400 * 420 
Seqidl : GCAATCGATCCTAATTTTATTGCAGGTGTA : 42 0 
Seqid3 : C : 420 



* 440 * 
Seqidl : GATGTTAATAAAAGCAACTTTTCAGGAGCA : 450 
Seqid3 : G : 450 



460 * 480 
Seqidl : AGCGGTATAAATGCGTTAGCAGGCAGTGCT : 480 
Seqid3 : c : 480 



* 500 * 
Seqidl : AATTTTAGAACATTAGGCGTTAATGATGTT : 510 
Seqid3 : A G : 510 



520 * 540 
Seqidl : ATT AC CGATGACAAAC CATTT GGCATT ATT : 54 0 
Seqid3 : C : 540 



* 560 * 
Seqidl : CTGAAAGGAATGACAGGGAGTAATGCCACT : 570 
Seqid3 : C : 570 



580 * 600
Seqid1 : AAATCCAATTTTATGACAATGGCTGCTGGC : 600
Seqid3 : G.CA A... : 600

* 620 *
Seqid1 : AGAAAATGGCTTGATAATGGTGGCTATGTA : 630
Seqid3 : : 630

640 * 660
Seqid1 : GGCGTGGTGTATGGTTATAGCCAACGTGAA : 660
Seqid3 : A : 660

* 680 *
Seqid1 : GTATCTCAAGATTACCGTATCGGTGGCGGA : 690
Seqid3 : ..T..A T A : 690

700 * 720
Seqid1 : GAACGATTAGCATCATTAGGGCAGGATATT : 720
Seqid3 : A : 720

* 740 *
Seqid1 : CTCGCGAAAGAAAAAGAAGCTTATTTTCGT : 750
Seqid3 : ..T..T AAGAT : 750

760 * 780
Seqid1 : AATGCGGGTTATATTTTAAATCCTGAAGGG : 780
Seqid3 : ... .AT G T. . .CT. .A : 780

* 800 *
Seqid1 : CAATGGACACCTGATTTAAGCAAAAAACAT : 810
Seqid3 : G A. . . .CC. . . . : 810

820 * 840
Seqid1 : TGGTCTTGTAACAAACCAGATTATCAGAAA : 840
Seqid3 : T . CC . . GAG . . C . TTA. . . : 840

* 860 *
Seqid1 : AATGGTGAT------------TGTAGTTAT : 858
Seqid3 : G . . AAAAG . ATGAGTACATCT .... AGCC . : 870

880 * 900
Seqid1 : TATCGTATTGGATCTGCTGCAAAGACTAGA : 888
Seqid3 : C C C : 900

* 920 *
Seqid1 : AGAGAAATTCTACAAGAATTATTAACAAAT : 918
Seqid3 : CA A GA.G. . : 930

940 * 960
Seqid1 : GGAAAAAAACCTAAGGATATTGAAAAGCTC : 948
Seqid3 : G : 960

* 980 *
Seqid1 : CAAAAAGGTAATGATGGAATTGAAGAAACT : 978
Seqid3 : A : 990

1000 * 1020
Seqid1 : GACAAATCATTTGAACGTAATAAAGATCAA : 1008
Seqid3 : . .A : 1020

* 1040 *
Seqid1 : TATAGTGTTGCACCGATTGAGCCGGGTAGT : 1038
Seqid3 : . . . GAC . . C . . C . . T T : 1050

1060 * 1080
Seqid1 : TTGCAATCTCGTTCTCGTAGCCATTTATTA : 1068
Seqid3 : A T : 1080

* 1100 *
Seqid1 : AAATTTGAATATGGCGATGATCACCAAAAT : 1098
Seqid3 : A T . CG : 1110

1120 * 1140
Seqid1 : TTAGGGGCGCAATTACGCACGTTGGATAAT : 1128
Seqid3 : C A. ...T..CC.T : 1140

* 1160 *
Seqid1 : AAAATTGGTTCTCGCAAAATTGAAAACCGT : 1158
Seqid3 : : 1170

30 



1180 * 1200
Seqid1 : AATTACCAAGTCAATTATAACTTCAATAAT : 1188
Seqid3 : : 1200

* 1220 *
Seqid1 : AACAGCTATCTTGATCTTAATTTAATGGCT : 1218
Seqid3 : : 1230

1240 * 1260
Seqid1 : GCACATAACATTGGAAAAACTATTTATCCT : 1248
Seqid3 : C : 1260

* 1280 *
Seqid1 : AAAGGCGGTTTTTTTGCTGGCTGGCAAGTG : 1278
Seqid3 : . .G. .T : 1290

1300 * 1320
Seqid1 : GCAGATAAACTTATCACTAAAAATGTCGCA : 1308
Seqid3 : C A G. . . : 1320

* 1340 *
Seqid1 : AATATTGTTGATATAAACAACAGCCATACT : 1338
Seqid3 : T : 1350

1360 * 1380
Seqid1 : TTCTTACTGCCAAAAGAAATTGATTTAAAA : 1368
Seqid3 : C : 1380

* 1400 *
Seqid1 : ACCACATTAGGTTTTAACTATTTTACCAAT : 1398
Seqid3 : G : 1410

1420 * 1440
Seqid1 : GAATACAGTAAAAACCGTTTTCCAGAAGAA : 1428
Seqid3 : : 1440

* 1460 *
Seqid1 : TTAAGTTTGTTTTATAACGATGCTTCACAT : 1458
Seqid3 : GTGA. . .AA : 1470

1480 * 1500
Seqid1 : GATCAAGGCTTATATTCACACAGTAAAAGA : 1488
Seqid3 : T T.A : 1500

* 1520 *
Seqid1 : GGGCGATATTCTGGCACAAAAAGTTTATTA : 1518
Seqid3 : T G : 1530

1540 * 1560
Seqid1 : CCACAACGTTCAGTAATCTTACAACCTTCT : 1548
Seqid3 : : 1560

* 1580 *
Seqid1 : GGCAAGCAAAAATTTAAAACCGTGTATTTT : 1578
Seqid3 : A : 1590



WO 00/50599 
1600 * 1620
Seqid1 : GATACCGCACTTTCTAAAGGCATTTATCAT : 1608
Seqid3 : T : 1620

* 1640 *
Seqid1 : TTAAATTACAGCGTGAATTTTACCCATTAT : 1638
Seqid3 : : 1650

1660 * 1680
Seqid1 : GCCTTTAATGGTGAGTATGTAGGTTACGAA : 1668
Seqid3 : T. .A. .TA. . : 1680

* 1700 *
Seqid1 : AATACAGCGGGTCAACAAATTAATGAACCT : 1698
Seqid3 : A. A. A : 1707

1720 * 1740
Seqid1 : ATTTTGCATAAATCAGGGCATAAAAAGGCA : 1728
Seqid3 : : 1737

* 1760 *
Seqid1 : TTCAATCATTCTGCCACATTAAGTGCAGAA : 1758
Seqid3 : T G : 1767

1780 * 1800
Seqid1 : CTGAGTGATTATTTTATGCCATTTTTTACT : 1788
Seqid3 : . .A : 1797

* 1820 *
Seqid1 : TATTCACGCACTCACAGAATGCCGAATATT : 1818
Seqid3 : A : 1827

1840 * 1860
Seqid1 : CAAGAGATGTTTTTCTCTCAAGTGTCTAAT : 1848
Seqid3 : G. . : 1857

* 1880 *
Seqid1 : GCAGGGGTAAACACAGCATTAAAACCTGAA : 1878
Seqid3 : . .T C : 1887

1900 * 1920
Seqid1 : CAATCTGACACCTATCAACTAGGCTTTAAT : 1908
Seqid3 : : 1917

* 1940 *
Seqid1 : ACTTATAAAAAAGGTCTCTTCACTCAAGAC : 1938
Seqid3 : A : 1947

1960 * 1980
Seqid1 : GATGTGCTAGGCGTAAAATTAGTAGGCTAT : 1968
Seqid3 : AT A.C G : 1977

* 2000 * 
Seqid1 : CGTAGCTTTATTAAAAACTATATCCATAAT : 1998
Seqid3 : C. . . : 2007

2020 * 2040
Seqid1 : GTTTATGGTGTTTGGTGGCGAGATGGC--- : 2025
Seqid3 : . .G A. A CA TGTT : 2037

* 2060 *
Seqid1 : ATGCCTACGTGGGCAGAAAGTAATGGATTT : 2055
Seqid3 : AGA AG.CTC T. . . : 2067

2080 * 2100
Seqid1 : AAATATACTATTGCTCATCAAAATTATAAG : 2085
Seqid3 : CGTCTG..G C.A : 2097

* 2120 *
Seqid1 : CCTATTGTGAAAAAGAGCGGCGTCGAGTTA : 2115
Seqid3 : . .A. .A A A.CT : 2127

2140 * 2160
Seqid1 : GAAATTAACTATGACATGGGACGTTTTTTT : 2145
Seqid3 : ..GC.C..T T G : 2157

* 2180 *
Seqid1 : GCGAATGTCTCTTATGCATATCAACGAACA : 2175
Seqid3 : . .A. . .C.G T T. .T : 2187

2200 * 2220
Seqid1 : AATCAACCAACCAATTATGCCGATGCCAGC : 2205
Seqid3 : G : 2217

* 2240 *
Seqid1 : CCGCGTCCGAATAATGCTTCACAAGAAGAC : 2235
Seqid3 : T . A CG A G : 2247

2260 * 2280
Seqid1 : ATTTTGAAACAAGGTTATGGCTTATCTCGT : 2265
Seqid3 : T A. .A : 2277

* 2300 *
Seqid1 : GTTTCAATGCTACCAAAAGACTACGGCAGA : 2295
Seqid3 : A.C..T...T G T... : 2307

2320 * 2340
Seqid1 : TTAGAGCTTGGCACACGTTGGTTTGATCAA : 2325
Seqid3 : C : 2337

* 2360 *
Seqid1 : AAATTAACCTTAGGTCTGGCAGCTCGTTAT : 2355
Seqid3 : TC.T...A.C C C : 2367

2380 * 2400
Seqid1 : TATGGAAAAAGTAAACGTGCGACAATTGAA : 2385
Seqid3 : T....C.C. : 2397

* 2420 *
Seqid1 : GAAGAATATATCAATGGATCTCGCTTTAAA : 2415
Seqid3 : C.....C..C A.G.. : 2427

2440 * 2460
Seqid1 : AAAAATACCTTGCGTCGTGAAAATTACTAT : 2445
Seqid3 : TAC . . . CGACAG . . T . . .T-. . . : 2457

* 2480 *
Seqid1 : GCCGTGAAAAAAACGGAAGATATTAAAAAA : 2475
Seqid3 : . . TA . T G..A G : 2487

2500 * 2520
Seqid1 : CAACCGATTATTTTAGATTTACACGTCAGC : 2505
Seqid3 : T : 2517

* 2540 *
Seqid1 : TATGAACCAATCAAAGATTTGATTATTAAA : 2535
Seqid3 : : 2547

2560 * 2580
Seqid1 : GCGGAAGTACAAAATCTATTAGATAAACGT : 2565
Seqid3 : : 2577



* 2600 *
Seqid1 : TATGTTGATCCGTTAGATGCTGGAAATGAC : 2595
Seqid3 : .T : 2607

2620 * 2640
Seqid1 : GCGGCTTCGCAACGTTATTATTCAAGTTTA : 2625
Seqid3 : : 2637

* 2660 *
Seqid1 : AATAATTCTATAGAATGTGCGCAAGATTCT : 2655
Seqid3 : ...G T . . . CC . . . AAAAT . A. . GAA : 2667

2680 * 2700
Seqid1 : TCTGCTTGC---GGTGGTTCAGATAAAACC : 2682
Seqid3 : . .AA. C. .TAAT. A G T : 2697

* 2720 *
Seqid1 : GTGCTTTATAACTTTGCACGTGGAAGAACT : 2712
Seqid3 : : 2727

2740 * 2760
Seqid1 : TATATTCTGAGTTTAAACTATAAATTCTAA : 2742
Seqid3 : G G : 2757
Figure 2: Alignment of the BASB070 polypeptide sequences.

Identity to SEQ ID NO:2 is indicated by a dot and a gap is indicated by a dash.
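
By way of illustration only, the dot-and-dash convention stated above can be sketched in Python. This is not part of the disclosed method: the function name `dot_notation` is hypothetical, and the sketch assumes the two sequences have already been aligned to equal length (gaps written as `-`). The example inputs are the first 16 residues of SEQ ID NO:2 and SEQ ID NO:4 as given in the sequence listing.

```python
def dot_notation(reference: str, other: str) -> str:
    """Render `other` against `reference` in the style of the alignment figures.

    '.' marks a position identical to the reference, '-' marks a gap in
    `other`, and any other position shows the differing residue of `other`.
    Both inputs are assumed to be pre-aligned strings of equal length.
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for ref_char, oth_char in zip(reference, other):
        if oth_char == "-":
            out.append("-")        # gap in the compared sequence
        elif oth_char == ref_char:
            out.append(".")        # identity to the reference
        else:
            out.append(oth_char)   # substitution: show the residue itself
    return "".join(out)


# First 16 residues of SEQ ID NO:2 (reference) and SEQ ID NO:4
seq2_start = "MKKAIKLNLITLGLIN"
seq4_start = "MKKAIKLNLITLSLIN"

print("Seqid2 :", seq2_start)
print("Seqid4 :", dot_notation(seq2_start, seq4_start))
```

For these two fragments the only difference is at residue 13 (Gly in SEQ ID NO:2, Ser in SEQ ID NO:4), so the second printed row is twelve dots, an `S`, and three dots.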

* 20 *
Seqid2 : MKKAIKLNLITLGLINTIGMTITQAQAEET : 30
Seqid4 : S : 30

40 * 60
Seqid2 : LGQIDVVEKVISNDKKPFTEAKAKSTRENV : 60
Seqid4 : : 60

* 80 *
Seqid2 : FKETQTIDQVIRSIPGAFTQQDKGSGVVSV : 90
Seqid4 : : 90

100 * 120
Seqid2 : NIRGENGLGRVNTMVDGVTQTFYSTALDSG : 120
Seqid4 : : 120

* 140 *
Seqid2 : QSGGSSQFGAAIDPNFIAGVDVNKSNFSGA : 150
Seqid4 : : 150

160 * 180
Seqid2 : SGINALAGSANFRTLGVNDVITDDKPFGII : 180
Seqid4 : S : 180

* 200 *
Seqid2 : LKGMTGSNATKSNFMTMAAGRKWLDNGGYV : 210
Seqid4 : T : 210

220 * 240
Seqid2 : GVVYGYSQREVSQDYRIGGGERLASLGQDI : 240
Seqid4 : : 240

* 260 *
Seqid2 : LAKEKEAYFRNAGYILNPEGQWTPDLSKKH : 270
Seqid4 : KI...D S A . . . A . . . N . P . : 270



280 * 300
Seqid2 : WSCNKPDYQKN----GDCSYYRIGSAAKTR : 296
Seqid4 : . . . .T.SSL. . KSMSTS . KP . . . . P . . T . . : 300

* 320 *
Seqid2 : REILQELLTNGKKPKDIEKLQKGNDGIEET : 326
Seqid4 : Q. . .K. . .E. . .E S : 330

340 * 360
Seqid2 : DKSFERNKDQYSVAPIEPGSLQSRSRSHLL : 356
Seqid4 : E D : 360

* 380 *
Seqid2 : KFEYGDDHQNLGAQLRTLDNKIGSRKIENR : 386
Seqid4 : . . . .S . . .HT : 390

400 * 420
Seqid2 : NYQVNYNFNNNSYLDLNLMAAHNIGKTIYP : 416
Seqid4 : : 420

* 440 *
Seqid2 : KGGFFAGWQVADKLITKNVANIVDINNSHT : 446
Seqid4 : : 450

460 * 480
Seqid2 : FLLPKEIDLKTTLGFNYFTNEYSKNRFPEE : 476
Seqid4 : : 480

* 500 *
Seqid2 : LSLFYNDASHDQGLYSHSKRGRYSGTKSLL : 506
Seqid4 : V.E L.N G. . : 510

520 * 540
Seqid2 : PQRSVILQPSGKQKFKTVYFDTALSKGIYH : 536
Seqid4 : : 540
* 560 *
Seqid2 : LNYSVNFTHYAFNGEYVGYENTAGQQINEP : 566
Seqid4 : K. . . -DK : 569

580 * 600
Seqid2 : ILHKSGHKKAFNHSATLSAELSDYFMPFFT : 596
Seqid4 : : 599

* 620 *
Seqid2 : YSRTHRMPNIQEMFFSQVSNAGVNTALKPE : 626
Seqid4 : D : 629

640 * 660
Seqid2 : QSDTYQLGFNTYKKGLFTQDDVLGVKLVGY : 656
Seqid4 : I : 659

* 680 *
Seqid2 : RSFIKNYIHNVYGVWWRDG-MPTWAESNGF : 685
Seqid4 : D . S . . . V. . E . . RL . . . : 689

700 * 720
Seqid2 : KYTIAHQNYKPIVKKSGVELEINYDMGRFF : 715
Seqid4 : .L Q A : 719

* 740 *
Seqid2 : ANVSYAYQRTNQPTNYADASPRPNNASQED : 745
Seqid4 : .S..R...K.E : 749

760 * 780
Seqid2 : ILKQGYGLSRVSMLPKDYGRLELGTRWFDQ : 775
Seqid4 : I : 779

* 800 *
Seqid2 : KLTLGLAARYYGKSKRATIEEEYINGSRFK : 805
Seqid4 : T E : 809

820 * 840
Seqid2 : KNTLRRENYYAVKKTEDIKKQPIILDLHVS : 835
Seqid4 : . . .T.DRI : 839

* 860 *
Seqid2 : YEPIKDLIIKAEVQNLLDKRYVDPLDAGND : 865
Seqid4 : : 869

880 * 900
Seqid2 : AASQRYYSSLNNSIECAQDSSACG-GSDKT : 894
Seqid4 : A.KI.E.T.ND..E.. : 899

*
Seqid2 : VLYNFARGRTYILSLNYKF : 913
Seqid4 : : 918
WO 00/50599 
SEQUENCE LISTING

<110> SmithKline Beecham Biologicals S.A.

<120> Novel Compounds

<130> BM45376

<160> 4

<170> FastSEQ for Windows Version 3.0

<210> 1
<211> 2742
<212> DNA
<213> Haemophilus influenzae strain Rd KW20

<400> 1

atgaagaaag ctataaaatt aaatttaatt acacttggcc taattaatac gatcggtatg  60
acgattacac aagctcaagc cgaagaaaca ttaggacaaa ttgatgtagt ggaaaaagtt  120
atatcaaacg ataaaaaacc tttcactgaa gccaaagcca aaagtacacg tgaaaatgtc  180
tttaaggaaa cacaaaccat tgaccaagtg attcgaagta tccctggtgc atttactcaa  240
caagataaag gctcgggtgt cgtttctgtg aatattcgtg gcgaaaatgg attaggtcgt  300
gtcaatacta tggttgatgg tgtaacacag accttctatt ctacagcctt agactcaggt  360
caatcaggcg gaagttctca atttggtgcg gcaatcgatc ctaattttat tgcaggtgta  420
gatgttaata aaagcaactt ttcaggagca agcggtataa atgcgttagc aggcagtgct  480
aattttagaa cattaggcgt taatgatgtt attaccgatg acaaaccatt tggcattatt  540
ctgaaaggaa tgacagggag taatgccact aaatccaatt ttatgacaat ggctgctggc  600
agaaaatggc ttgataatgg tggctatgta ggcgtggtgt atggttatag ccaacgtgaa  660
gtatctcaag attaccgtat cggtggcgga gaacgattag catcattagg gcaggatatt  720
ctcgcgaaag aaaaagaagc ttattttcgt aatgcgggtt atattttaaa tcctgaaggg  780
caatggacac ctgatttaag caaaaaacat tggtcttgta acaaaccaga ttatcagaaa  840
aatggtgatt gtagttatta tcgtattgga tctgctgcaa agactagaag agaaattcta  900
caagaattat taacaaatgg aaaaaaacct aaggatattg aaaagctcca aaaaggtaat  960
gatggaattg aagaaactga caaatcattt gaacgtaata aagatcaata tagtgttgca  1020
ccgattgagc cgggtagttt gcaatctcgt tctcgtagcc atttattaaa atttgaatat  1080
ggcgatgatc accaaaattt aggggcgcaa ttacgcacgt tggataataa aattggttct  1140
cgcaaaattg aaaaccgtaa ttaccaagtc aattataact tcaataataa cagctatctt  1200
gatcttaatt taatggctgc acataacatt ggaaaaacta tttatcctaa aggcggtttt  1260
tttgctggct ggcaagtggc agataaactt atcactaaaa atgtcgcaaa tattgttgat  1320
ataaacaaca gccatacttt cttactgcca aaagaaattg atttaaaaac cacattaggt  1380
tttaactatt ttaccaatga atacagtaaa aaccgttttc cagaagaatt aagtttgttt  1440
tataacgatg cttcacatga tcaaggctta tattcacaca gtaaaagagg gcgatattct  1500
ggcacaaaaa gtttattacc acaacgttca gtaatcttac aaccttctgg caagcaaaaa  1560
tttaaaaccg tgtattttga taccgcactt tctaaaggca tttatcattt aaattacagc  1620
gtgaatttta cccattatgc ctttaatggt gagtatgtag gttacgaaaa tacagcgggt  1680
caacaaatta atgaacctat tttgcataaa tcagggcata aaaaggcatt caatcattct  1740
gccacattaa gtgcagaact gagtgattat tttatgccat tttttactta ttcacgcact  1800
cacagaatgc cgaatattca agagatgttt ttctctcaag tgtctaatgc aggggtaaac  1860
acagcattaa aacctgaaca atctgacacc tatcaactag gctttaatac ttataaaaaa  1920
ggtctcttca ctcaagacga tgtgctaggc gtaaaattag taggctatcg tagctttatt  1980
aaaaactata tccataatgt ttatggtgtt tggtggcgag atggcatgcc tacgtgggca  2040
gaaagtaatg gatttaaata tactattgct catcaaaatt ataagcctat tgtgaaaaag  2100
agcggcgtcg agttagaaat taactatgac atgggacgtt tttttgcgaa tgtctcttat  2160
gcatatcaac gaacaaatca accaaccaat tatgccgatg ccagcccgcg tccgaataat  2220
gcttcacaag aagacatttt gaaacaaggt tatggcttat ctcgtgtttc aatgctacca  2280
aaagactacg gcagattaga gcttggcaca cgttggtttg atcaaaaatt aaccttaggt  2340
ctggcagctc gttattatgg aaaaagtaaa cgtgcgacaa ttgaagaaga atatatcaat  2400
ggatctcgct ttaaaaaaaa taccttgcgt cgtgaaaatt actatgccgt gaaaaaaacg  2460
gaagatatta aaaaacaacc gattatttta gatttacacg tcagctatga accaatcaaa  2520
gatttgatta ttaaagcgga agtacaaaat ctattagata aacgttatgt tgatccgtta  2580
gatgctggaa atgacgcggc ttcgcaacgt tattattcaa gtttaaataa ttctatagaa  2640
tgtgcgcaag attcttctgc ttgcggtggt tcagataaaa ccgtgcttta taactttgca  2700
cgtggaagaa cttatattct gagtttaaac tataaattct aa  2742



<210> 2
<211> 913
<212> PRT
<213> Haemophilus influenzae strain Rd KW20

<400> 2

Met Lys Lys Ala Ile Lys Leu Asn Leu Ile Thr Leu Gly Leu Ile Asn
  1               5                  10                  15

Thr Ile Gly Met Thr Ile Thr Gln Ala Gln Ala Glu Glu Thr Leu Gly
             20                  25                  30

Gln Ile Asp Val Val Glu Lys Val Ile Ser Asn Asp Lys Lys Pro Phe
         35                  40                  45

Thr Glu Ala Lys Ala Lys Ser Thr Arg Glu Asn Val Phe Lys Glu Thr
     50                  55                  60

Gln Thr Ile Asp Gln Val Ile Arg Ser Ile Pro Gly Ala Phe Thr Gln
 65                  70                  75                  80

Gln Asp Lys Gly Ser Gly Val Val Ser Val Asn Ile Arg Gly Glu Asn
                 85                  90                  95

Gly Leu Gly Arg Val Asn Thr Met Val Asp Gly Val Thr Gln Thr Phe
            100                 105                 110

Tyr Ser Thr Ala Leu Asp Ser Gly Gln Ser Gly Gly Ser Ser Gln Phe
        115                 120                 125

Gly Ala Ala Ile Asp Pro Asn Phe Ile Ala Gly Val Asp Val Asn Lys
    130                 135                 140

Ser Asn Phe Ser Gly Ala Ser Gly Ile Asn Ala Leu Ala Gly Ser Ala
145                 150                 155                 160

Asn Phe Arg Thr Leu Gly Val Asn Asp Val Ile Thr Asp Asp Lys Pro
                165                 170                 175

Phe Gly Ile Ile Leu Lys Gly Met Thr Gly Ser Asn Ala Thr Lys Ser
            180                 185                 190

Asn Phe Met Thr Met Ala Ala Gly Arg Lys Trp Leu Asp Asn Gly Gly
        195                 200                 205

Tyr Val Gly Val Val Tyr Gly Tyr Ser Gln Arg Glu Val Ser Gln Asp
    210                 215                 220

Tyr Arg Ile Gly Gly Gly Glu Arg Leu Ala Ser Leu Gly Gln Asp Ile
225                 230                 235                 240

Leu Ala Lys Glu Lys Glu Ala Tyr Phe Arg Asn Ala Gly Tyr Ile Leu
                245                 250                 255

Asn Pro Glu Gly Gln Trp Thr Pro Asp Leu Ser Lys Lys His Trp Ser
            260                 265                 270

Cys Asn Lys Pro Asp Tyr Gln Lys Asn Gly Asp Cys Ser Tyr Tyr Arg
        275                 280                 285

Ile Gly Ser Ala Ala Lys Thr Arg Arg Glu Ile Leu Gln Glu Leu Leu
    290                 295                 300

Thr Asn Gly Lys Lys Pro Lys Asp Ile Glu Lys Leu Gln Lys Gly Asn
305                 310                 315                 320

Asp Gly Ile Glu Glu Thr Asp Lys Ser Phe Glu Arg Asn Lys Asp Gln
                325                 330                 335

Tyr Ser Val Ala Pro Ile Glu Pro Gly Ser Leu Gln Ser Arg Ser Arg
            340                 345                 350

Ser His Leu Leu Lys Phe Glu Tyr Gly Asp Asp His Gln Asn Leu Gly
        355                 360                 365

Ala Gln Leu Arg Thr Leu Asp Asn Lys Ile Gly Ser Arg Lys Ile Glu
    370                 375                 380

Asn Arg Asn Tyr Gln Val Asn Tyr Asn Phe Asn Asn Asn Ser Tyr Leu
385                 390                 395                 400

Asp Leu Asn Leu Met Ala Ala His Asn Ile Gly Lys Thr Ile Tyr Pro
                405                 410                 415

Lys Gly Gly Phe Phe Ala Gly Trp Gln Val Ala Asp Lys Leu Ile Thr
            420                 425                 430

Lys Asn Val Ala Asn Ile Val Asp Ile Asn Asn Ser His Thr Phe Leu
        435                 440                 445

Leu Pro Lys Glu Ile Asp Leu Lys Thr Thr Leu Gly Phe Asn Tyr Phe
    450                 455                 460

Thr Asn Glu Tyr Ser Lys Asn Arg Phe Pro Glu Glu Leu Ser Leu Phe
465                 470                 475                 480

Tyr Asn Asp Ala Ser His Asp Gln Gly Leu Tyr Ser His Ser Lys Arg
                485                 490                 495

Gly Arg Tyr Ser Gly Thr Lys Ser Leu Leu Pro Gln Arg Ser Val Ile
            500                 505                 510

Leu Gln Pro Ser Gly Lys Gln Lys Phe Lys Thr Val Tyr Phe Asp Thr
        515                 520                 525

Ala Leu Ser Lys Gly Ile Tyr His Leu Asn Tyr Ser Val Asn Phe Thr
    530                 535                 540

His Tyr Ala Phe Asn Gly Glu Tyr Val Gly Tyr Glu Asn Thr Ala Gly
545                 550                 555                 560

Gln Gln Ile Asn Glu Pro Ile Leu His Lys Ser Gly His Lys Lys Ala
                565                 570                 575

Phe Asn His Ser Ala Thr Leu Ser Ala Glu Leu Ser Asp Tyr Phe Met
            580                 585                 590

Pro Phe Phe Thr Tyr Ser Arg Thr His Arg Met Pro Asn Ile Gln Glu
        595                 600                 605

Met Phe Phe Ser Gln Val Ser Asn Ala Gly Val Asn Thr Ala Leu Lys
    610                 615                 620

Pro Glu Gln Ser Asp Thr Tyr Gln Leu Gly Phe Asn Thr Tyr Lys Lys
625                 630                 635                 640

Gly Leu Phe Thr Gln Asp Asp Val Leu Gly Val Lys Leu Val Gly Tyr
                645                 650                 655

Arg Ser Phe Ile Lys Asn Tyr Ile His Asn Val Tyr Gly Val Trp Trp
            660                 665                 670

Arg Asp Gly Met Pro Thr Trp Ala Glu Ser Asn Gly Phe Lys Tyr Thr
        675                 680                 685

Ile Ala His Gln Asn Tyr Lys Pro Ile Val Lys Lys Ser Gly Val Glu
    690                 695                 700

Leu Glu Ile Asn Tyr Asp Met Gly Arg Phe Phe Ala Asn Val Ser Tyr
705                 710                 715                 720

Ala Tyr Gln Arg Thr Asn Gln Pro Thr Asn Tyr Ala Asp Ala Ser Pro
                725                 730                 735

Arg Pro Asn Asn Ala Ser Gln Glu Asp Ile Leu Lys Gln Gly Tyr Gly
            740                 745                 750

Leu Ser Arg Val Ser Met Leu Pro Lys Asp Tyr Gly Arg Leu Glu Leu
        755                 760                 765

Gly Thr Arg Trp Phe Asp Gln Lys Leu Thr Leu Gly Leu Ala Ala Arg
    770                 775                 780

Tyr Tyr Gly Lys Ser Lys Arg Ala Thr Ile Glu Glu Glu Tyr Ile Asn
785                 790                 795                 800

Gly Ser Arg Phe Lys Lys Asn Thr Leu Arg Arg Glu Asn Tyr Tyr Ala
                805                 810                 815

Val Lys Lys Thr Glu Asp Ile Lys Lys Gln Pro Ile Ile Leu Asp Leu
            820                 825                 830

His Val Ser Tyr Glu Pro Ile Lys Asp Leu Ile Ile Lys Ala Glu Val
        835                 840                 845

Gln Asn Leu Leu Asp Lys Arg Tyr Val Asp Pro Leu Asp Ala Gly Asn
    850                 855                 860

Asp Ala Ala Ser Gln Arg Tyr Tyr Ser Ser Leu Asn Asn Ser Ile Glu
865                 870                 875                 880

Cys Ala Gln Asp Ser Ser Ala Cys Gly Gly Ser Asp Lys Thr Val Leu
                885                 890                 895

Tyr Asn Phe Ala Arg Gly Arg Thr Tyr Ile Leu Ser Leu Asn Tyr Lys
            900                 905                 910

Phe

20 



<210> 3 
<211> 2757 
<212> DNA 

<213> non typeable Haemophilus Influenzae strain 3224 



<400> 3 

atgaagaaag ctataaaatt aaatttaatt acacttagcc taatcaatac aatcggtatg  60
acgattacac aagctcaagc cgaagaaaca ttagggcaaa ttgatgtcgt agaaaaagtg  120
atatcaaatg acaaaaaacc tttcactgaa gccaaagcca aaagtacgcg tgaaaatgtc  180
tttaaggaaa cacaaaccat tgaccaagtc attcggagca ttcctggggc atttactcaa  240
caagataaag gctcgggtgt ggtttctgta aatattcgtg gcgaaaatgg attaggtcgt  300
gtcaatacga tggttgatgg tgtaacccaa accttctatt ctacagcctt agactctggt  360
caatcaggcg gaagttctca atttggtgcg gcaatcgacc ctaattttat tgcaggtgta  420
gatgttaata aaagcaactt ttcgggagca agcggtataa atgccttagc aggcagtgct  480
aattttagaa cattaagcgt taatgatgtg attaccgatg acaaaccatt cggcattatt  540
ctgaaaggaa tgacagggag caatgccact aaatccaatt ttatgacgac agctgcaggc  600
agaaaatggc ttgataatgg tggctatgta ggcgtagtgt atggttatag ccaacgtgaa  660
gtttcacaag attatcgtat aggtggcgga gaacgattag catcattagg gcaagatatt  720
cttgctaaag aaaaagaaaa gatttttcgt aatgatggtt atgttttaaa ttctgctgga  780
caatgggcac ctgatttaaa caaaccacat tggtcttgta ataccccgag ttctttaaaa  840
gataaaagta tgagtacatc ttgtaagcct tatcgtcttg gacctgctgc aacgactaga  900
caagaaattc taaaagaatt attagaagat ggaaaagaac ctaaggatat tgaaaagctc  960
caaaaaagta atgatggaat tgaagaaact gaaaaatcat ttgaacgtaa taaagatcaa  1020
tatgacgtcg cccctattga gcctggtagt ttgcaatctc gttcacgtag tcatttatta  1080
aaatttgaat atagcgatga tcaccatacg ctaggggcgc aaatacgtac ccttgataat  1140
aaaattggtt ctcgcaaaat tgaaaaccgt aattaccaag tcaattataa cttcaataat  1200
aacagctatc ttgatcttaa tttaatggct gcacataaca ttggcaaaac tatttatcct  1260
aagggtggtt tttttgctgg ctggcaagtg gcagacaaac ttatcacaaa aaatgtggca  1320
aatattgttg atataaataa cagccatact ttcttactgc caaaagaaat cgatttaaaa  1380
accacattag ggtttaacta ttttaccaat gaatacagta aaaaccgttt tccagaagaa  1440
ttaagtttgt tttatgtgaa tgaatcacat gatcaaggct tatattcact cagtaataaa  1500
gggcgatatt ctggctcaaa aggtttatta ccacaacgtt cagtaatctt acaaccttct  1560
ggcaagcaaa aatttaaaac agtgtatttt gataccgcac tttctaaagg tatttatcat  1620
ttaaattaca gcgtgaattt tacccattat gcctttaatg gtgagtatgt tggatataaa  1680
aatacagcag ataaaattaa tgaacctatt ttgcataaat cagggcataa aaaggcattc  1740
aatcattctg ctacattaag tgcagagcta agtgattatt ttatgccatt ttttacttat  1800
tcacgcacac acagaatgcc gaatattcaa gagatgtttt tctctcaagt gtctgatgct  1860
ggggtaaaca ccgcattaaa acctgaacaa tctgacacct atcaactagg ctttaatact  1920
tataaaaaag gtctattcac tcaagacgat gtattaggca tcaaattagt gggctatcgt  1980
agctttatta aaaactatat ccacaatgtg tatggagatt ggtcacgaga tggtgttatg  2040
ccagagtggg caagactcaa tggttttcgt ctgacgattg ctcatcaaaa ttatcaacca  2100
atagtgaaaa aaagcggagc tgagttagag ctcaattatg atatggggcg tttttttgca  2160
aatctgtctt atgcttatca acgtactaac cagccaacca attatgccga tgccagctca  2220
cgtccgcgta atgcttcaaa agaagagatt ttgaaacaag gttatggttt atcacgaatc  2280
tctatgttac caaaggacta cggtagatta gagcttggca cacgctggtt tgatcaaaaa  2340
ttaactcttg gtatcgcagc ccgttactat ggaaaaagta aacgtgctac aactcaagaa  2400
gaatacatca acggctctcg ctatgaaaaa aatactacgc gcgacagaat ttattatgct  2460
attaaaaaga cagaagagat taaaaaacaa cctattattt tagatttaca cgtcagctat  2520
gaaccaatca aagatttgat tattaaagcg gaagtacaaa atctattaga taaacgttat  2580
gttgatccgt tagatgctgg aaatgatgcg gcttcgcaac gttattattc aagtttaaat  2640
gattctttag cctgtaaaat aaatgaatca acctgtaatg atggttcaga gaaaactgtg  2700
ctttataact ttgcacgtgg aagaacttat attctgagtt tgaactataa attctag  2757



<210> 4
<211> 918
<212> PRT
<213> non typeable Haemophilus Influenzae strain 3224

<400> 4

Met Lys Lys Ala Ile Lys Leu Asn Leu Ile Thr Leu Ser Leu Ile Asn
  1               5                  10                  15

Thr Ile Gly Met Thr Ile Thr Gln Ala Gln Ala Glu Glu Thr Leu Gly
             20                  25                  30

Gln Ile Asp Val Val Glu Lys Val Ile Ser Asn Asp Lys Lys Pro Phe
         35                  40                  45

Thr Glu Ala Lys Ala Lys Ser Thr Arg Glu Asn Val Phe Lys Glu Thr
     50                  55                  60

Gln Thr Ile Asp Gln Val Ile Arg Ser Ile Pro Gly Ala Phe Thr Gln
 65                  70                  75                  80

Gln Asp Lys Gly Ser Gly Val Val Ser Val Asn Ile Arg Gly Glu Asn
                 85                  90                  95

Gly Leu Gly Arg Val Asn Thr Met Val Asp Gly Val Thr Gln Thr Phe
            100                 105                 110

Tyr Ser Thr Ala Leu Asp Ser Gly Gln Ser Gly Gly Ser Ser Gln Phe
        115                 120                 125

Gly Ala Ala Ile Asp Pro Asn Phe Ile Ala Gly Val Asp Val Asn Lys
    130                 135                 140

Ser Asn Phe Ser Gly Ala Ser Gly Ile Asn Ala Leu Ala Gly Ser Ala
145                 150                 155                 160

Asn Phe Arg Thr Leu Ser Val Asn Asp Val Ile Thr Asp Asp Lys Pro
                165                 170                 175

Phe Gly Ile Ile Leu Lys Gly Met Thr Gly Ser Asn Ala Thr Lys Ser
            180                 185                 190

Asn Phe Met Thr Thr Ala Ala Gly Arg Lys Trp Leu Asp Asn Gly Gly
        195                 200                 205

Tyr Val Gly Val Val Tyr Gly Tyr Ser Gln Arg Glu Val Ser Gln Asp
    210                 215                 220

Tyr Arg Ile Gly Gly Gly Glu Arg Leu Ala Ser Leu Gly Gln Asp Ile
225                 230                 235                 240

Leu Ala Lys Glu Lys Glu Lys Ile Phe Arg Asn Asp Gly Tyr Val Leu
                245                 250                 255

Asn Ser Ala Gly Gln Trp Ala Pro Asp Leu Asn Lys Pro His Trp Ser
            260                 265                 270

Cys Asn Thr Pro Ser Ser Leu Lys Asp Lys Ser Met Ser Thr Ser Cys
        275                 280                 285

Lys Pro Tyr Arg Leu Gly Pro Ala Ala Thr Thr Arg Gln Glu Ile Leu
    290                 295                 300

Lys Glu Leu Leu Glu Asp Gly Lys Glu Pro Lys Asp Ile Glu Lys Leu
305                 310                 315                 320

Gln Lys Ser Asn Asp Gly Ile Glu Glu Thr Glu Lys Ser Phe Glu Arg
                325                 330                 335

Asn Lys Asp Gln Tyr Asp Val Ala Pro Ile Glu Pro Gly Ser Leu Gln
            340                 345                 350

Ser Arg Ser Arg Ser His Leu Leu Lys Phe Glu Tyr Ser Asp Asp His
        355                 360                 365

His Thr Leu Gly Ala Gln Ile Arg Thr Leu Asp Asn Lys Ile Gly Ser
    370                 375                 380

Arg Lys Ile Glu Asn Arg Asn Tyr Gln Val Asn Tyr Asn Phe Asn Asn
385                 390                 395                 400

Asn Ser Tyr Leu Asp Leu Asn Leu Met Ala Ala His Asn Ile Gly Lys
                405                 410                 415

Thr Ile Tyr Pro Lys Gly Gly Phe Phe Ala Gly Trp Gln Val Ala Asp
            420                 425                 430

Lys Leu Ile Thr Lys Asn Val Ala Asn Ile Val Asp Ile Asn Asn Ser
        435                 440                 445

His Thr Phe Leu Leu Pro Lys Glu Ile Asp Leu Lys Thr Thr Leu Gly
    450                 455                 460

Phe Asn Tyr Phe Thr Asn Glu Tyr Ser Lys Asn Arg Phe Pro Glu Glu
465                 470                 475                 480

Leu Ser Leu Phe Tyr Val Asn Glu Ser His Asp Gln Gly Leu Tyr Ser
                485                 490                 495

Leu Ser Asn Lys Gly Arg Tyr Ser Gly Ser Lys Gly Leu Leu Pro Gln
            500                 505                 510

Arg Ser Val Ile Leu Gln Pro Ser Gly Lys Gln Lys Phe Lys Thr Val
        515                 520                 525

Tyr Phe Asp Thr Ala Leu Ser Lys Gly Ile Tyr His Leu Asn Tyr Ser
    530                 535                 540

Val Asn Phe Thr His Tyr Ala Phe Asn Gly Glu Tyr Val Gly Tyr Lys
545                 550                 555                 560

Asn Thr Ala Asp Lys Ile Asn Glu Pro Ile Leu His Lys Ser Gly His
                565                 570                 575

Lys Lys Ala Phe Asn His Ser Ala Thr Leu Ser Ala Glu Leu Ser Asp
            580                 585                 590

Tyr Phe Met Pro Phe Phe Thr Tyr Ser Arg Thr His Arg Met Pro Asn
        595                 600                 605

Ile Gln Glu Met Phe Phe Ser Gln Val Ser Asp Ala Gly Val Asn Thr
    610                 615                 620

Ala Leu Lys Pro Glu Gln Ser Asp Thr Tyr Gln Leu Gly Phe Asn Thr
625                 630                 635                 640

Tyr Lys Lys Gly Leu Phe Thr Gln Asp Asp Val Leu Gly Ile Lys Leu
                645                 650                 655

Val Gly Tyr Arg Ser Phe Ile Lys Asn Tyr Ile His Asn Val Tyr Gly
            660                 665                 670

Asp Trp Ser Arg Asp Gly Val Met Pro Glu Trp Ala Arg Leu Asn Gly
        675                 680                 685

Phe Arg Leu Thr Ile Ala His Gln Asn Tyr Gln Pro Ile Val Lys Lys
    690                 695                 700

Ser Gly Ala Glu Leu Glu Leu Asn Tyr Asp Met Gly Arg Phe Phe Ala
705                 710                 715                 720

Asn Leu Ser Tyr Ala Tyr Gln Arg Thr Asn Gln Pro Thr Asn Tyr Ala
                725                 730                 735

Asp Ala Ser Ser Arg Pro Arg Asn Ala Ser Lys Glu Glu Ile Leu Lys
            740                 745                 750

Gln Gly Tyr Gly Leu Ser Arg Ile Ser Met Leu Pro Lys Asp Tyr Gly
        755                 760                 765

Arg Leu Glu Leu Gly Thr Arg Trp Phe Asp Gln Lys Leu Thr Leu Gly
    770                 775                 780

Ile Ala Ala Arg Tyr Tyr Gly Lys Ser Lys Arg Ala Thr Thr Gln Glu
785                 790                 795                 800

Glu Tyr Ile Asn Gly Ser Arg Tyr Glu Lys Asn Thr Thr Arg Asp Arg
                805                 810                 815

Ile Tyr Tyr Ala Ile Lys Lys Thr Glu Glu Ile Lys Lys Gln Pro Ile
            820                 825                 830

Ile Leu Asp Leu His Val Ser Tyr Glu Pro Ile Lys Asp Leu Ile Ile
        835                 840                 845

Lys Ala Glu Val Gln Asn Leu Leu Asp Lys Arg Tyr Val Asp Pro Leu
    850                 855                 860

Asp Ala Gly Asn Asp Ala Ala Ser Gln Arg Tyr Tyr Ser Ser Leu Asn
865                 870                 875                 880

Asp Ser Leu Ala Cys Lys Ile Asn Glu Ser Thr Cys Asn Asp Gly Ser
                885                 890                 895

Glu Lys Thr Val Leu Tyr Asn Phe Ala Arg Gly Arg Thr Tyr Ile Leu
            900                 905                 910

Ser Leu Asn Tyr Lys Phe
        915